This project demonstrates a light and straightforward Datavault raw model generator, provided by Acceliance

acceliance/GovernanceForDatavault

Governance for Datavault on Snowflake by Acceliance

We are pleased to share a practice for governing a Datavault model, including an automatic raw model generator.

This Datavault generator is provided free of charge by Acceliance. Feel free to reuse it and adapt it to your own needs.

Before starting

Install Modelio

Please follow the installation steps explained in the GitHub repo.

Next, open Modelio, switch the workspace to your local copy of this GitHub repo, and then open the project.

Governance in Real Life

Before generating any production Model, several things must be set up.

The Retail domain model

The Modelio project used to illustrate Governance for Datavault covers the Retail domain: products are sent to physical stores, which generate tickets.

The Model is present in the project as follows: (Screenshot: Retail Model sprint1)

Governing the Agile Sprints

Sprints are governed at the level of the single attribute (inside the object).

We will set Sprint1 for the Retail Data Mesh/Product so that it matches the visual model diagram (previous screenshot). Right-click the tree node (as selected in the previous screenshot).

(Screenshot: Setting Sprint1 for the Retail perimeter)

The diagram shows exactly the expected output, so we will use the usage cartography features of the Acceliance practice:

Select the diagram and right-click it.

When you open the Excel file (you will find it in the cartography folder of the project), you will notice that all the data from the diagram appears in the list. (Screenshot: the Excel file output for the cartography)

Now set Sprint1 for the whole column of the Retail Data Mesh/Product. (Screenshot: Set the sprint)

After saving and closing the Excel file, import the usage cartography back into the model repository. (Screenshot: Choose the import cartography option)

Click the Import button.

The import is finished, as the log confirms.

When you select any attribute in the Model diagram, you will notice that the sprint is set in the usage cartography stereotype. (Screenshot: The usage stereotype)

What we have seen regarding Governance

The perimeter of each Data product delivery is managed by setting usages and sprint labels.

Now we will see how to actually generate the output based on this governance practice.

Datavault and Automatic Generation for Snowflake

Scripting the Datavault generation

The Model contains all the metadata needed to generate code for actual deployment on the Data platform

Modelio uses the Python language for scripting against the Model.

Notice that the script is based on the usage cartography, as seen in the "Governing the Agile Sprints" part of the practice. (Screenshot: Usage in the script)
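To illustrate the idea of driving the generation from the usage cartography, here is a minimal sketch in plain Python. It deliberately does not use the Modelio API: the `Attribute` record, its `usages` mapping and the `attributes_for_sprint` function are hypothetical names chosen for illustration, and the sample attributes are invented rather than taken from the Retail Model.

```python
from collections import namedtuple

# Hypothetical stand-in for a model attribute and its usage cartography:
# `usages` maps a data product name to the sprint label set in the Excel file.
Attribute = namedtuple("Attribute", ["owner", "name", "usages"])


def attributes_for_sprint(attributes, product, sprint):
    """Keep only the attributes tagged with `sprint` for the given data product."""
    return [a for a in attributes if a.usages.get(product) == sprint]


# Invented sample data, for illustration only.
model = [
    Attribute("Product", "label", {"Retail": "Sprint1"}),
    Attribute("Product", "weight", {"Retail": "Sprint2"}),
    Attribute("Store", "city", {"Retail": "Sprint1"}),
    Attribute("Ticket", "amount", {}),  # no usage set: never generated
]

selected = attributes_for_sprint(model, "Retail", "Sprint1")
print([(a.owner, a.name) for a in selected])
```

Attributes without a usage for the requested product are simply skipped, which mirrors the principle that the sprint label defines the delivery perimeter.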

Invoking the script

The script is invoked from Modelio as follows: (Screenshot: gov dv 010)

Select the generate_datavault_for_sprint.py script to run. (Screenshot: gov dv 011)

The DDL script for Snowflake is generated into a dedicated folder suffixed with today's date. (Screenshot: gov dv 012)
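As a rough sketch of what such a generation step can look like, the following Python builds a date-suffixed folder name and a hub table definition. Everything here is an assumption made for illustration: the `datavault_` folder prefix, the function names, and the hub column conventions (`HK_`, `LOAD_DATE`, `RECORD_SOURCE`) are common Datavault choices, not necessarily what generate_datavault_for_sprint.py produces.

```python
import datetime
import os


def dated_output_dir(base_dir, today=None):
    """Return a folder path suffixed with the date (assumed YYYYMMDD format)."""
    today = today or datetime.date.today()
    return os.path.join(base_dir, "datavault_" + today.strftime("%Y%m%d"))


def hub_ddl(entity, business_key):
    """Build a Snowflake CREATE TABLE statement for a hub (assumed naming)."""
    return (
        "CREATE TABLE IF NOT EXISTS HUB_{n} (\n"
        "    HK_{n} VARCHAR(32) NOT NULL,\n"
        "    {bk} VARCHAR NOT NULL,\n"
        "    LOAD_DATE TIMESTAMP_NTZ NOT NULL,\n"
        "    RECORD_SOURCE VARCHAR NOT NULL,\n"
        "    PRIMARY KEY (HK_{n})\n"
        ");"
    ).format(n=entity.upper(), bk=business_key.upper())


# A fixed date keeps the example reproducible.
out_dir = dated_output_dir("generated", datetime.date(2024, 1, 31))
print(out_dir)
print(hub_ddl("Product", "product_code"))
```

Passing the date as a parameter rather than always calling `datetime.date.today()` makes the folder naming easy to test.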

More on Governance

Documenting the Model

The Model can be annotated with extensive documentation. As with the cartography, the content is governed using Excel and must be imported back into the model.

The data modelling workshops with the functional experts are the best moment to capture business definitions of the artefacts. This information can be kept and propagated to the enterprise ecosystem.

To export the dictionary, use the following command: (Screenshot: gov dv 013)

Then select the proper options. (Screenshot: gov dv 014)

Alignment with the Generation

The documentation is automatically aligned with the deployment into the Data platform.

From capture: (Screenshot: gov dv 015)

To deployment: (Screenshot: gov dv 016)
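One way to carry business definitions down to the platform is to emit Snowflake `COMMENT ON` statements alongside the DDL. The sketch below assumes this approach; the actual script may instead use inline `COMMENT` clauses or another mechanism. The function names, the table name and the definitions are hypothetical.

```python
def sql_literal(text):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def comment_statements(table, table_doc, column_docs):
    """Build COMMENT ON statements for a table and its documented columns."""
    stmts = ["COMMENT ON TABLE {0} IS {1};".format(table, sql_literal(table_doc))]
    for column in sorted(column_docs):
        stmts.append(
            "COMMENT ON COLUMN {0}.{1} IS {2};".format(
                table, column, sql_literal(column_docs[column])
            )
        )
    return stmts


# Invented definitions, for illustration only.
stmts = comment_statements(
    "SAT_PRODUCT",
    "Product's descriptive data",
    {"LABEL": "Commercial label", "BRAND": "Brand name"},
)
print("\n".join(stmts))
```

Escaping single quotes matters because business definitions captured in workshops are free text.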

Outputting for Enterprise Data Governance

The documentation can be exported to any Data Governance platform, such as Collibra or DataGalaxy, or to any other tool, using Python scripting.
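As an illustration, the dictionary could first be serialised to a neutral JSON document that is then adapted to the target tool. The structure below is an assumption made for this sketch: it is not the import format of Collibra, DataGalaxy or any other platform, and the function name, field names and definitions are hypothetical.

```python
import json


def dictionary_payload(domain, entries):
    """Turn (object, attribute, definition) rows into a JSON document."""
    terms = [
        {"object": obj, "attribute": attr, "definition": definition}
        for obj, attr, definition in entries
    ]
    return json.dumps({"domain": domain, "terms": terms}, indent=2, sort_keys=True)


# Invented definitions, for illustration only.
payload = dictionary_payload(
    "Retail",
    [
        ("Product", "label", "Commercial label shown to customers"),
        ("Store", "city", "City where the physical store is located"),
    ],
)
print(payload)
```

Each governance platform has its own import format or API, so a mapping step from this neutral structure would still be needed.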

Summarising on Governance

We have seen the following items regarding Datavault Governance:

  • Viewing the Model in a fully Business & graphical manner (with no technical considerations at all)

    • Opening the opportunity to workshop/communicate with non-IT people

  • Governing the sprints at the level of the single data item inside the object/concept

    • Using the graphical view to produce the sprint (WYSIWYG)

  • Using the Model metadata to generate physical implementation perfectly aligned with the Business view

    • Thus creating an automated continuum of Architecture

    • Using the UML stereotypes to control aspects of the physical generation

What's next?

Some functionalities are still missing, such as:

  • Transaction Links generation (these links are used to optimize the volume of SQL objects; the technical pattern is that the object must have only relations of cardinality Many To One. In the Retail Model, the ProductBuying object is a good candidate)

  • Multi Satellite generation (this feature can be implemented using stereotypes and then generating several satellites for one Model object)

  • You may adapt the script for other platforms such as Postgres (modify the type mapping table in the Python script)
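The Many To One pattern mentioned for Transaction Links can be sketched as a simple check over the relations of the Model. This is a simplified illustration in plain Python, not Modelio API code: the tuple layout, the function name and the sample relations are assumptions, and only relations declared from the source object are considered.

```python
# Hypothetical relation list: (source object, target object, cardinality).
# The sample relations are invented and may not match the real Retail Model.
relations = [
    ("ProductBuying", "Product", "many-to-one"),
    ("ProductBuying", "Store", "many-to-one"),
    ("ProductBuying", "Ticket", "many-to-one"),
    ("Store", "Product", "many-to-many"),
]


def transaction_link_candidates(relations):
    """Return objects whose relations are all of cardinality many-to-one."""
    by_source = {}
    for source, _target, cardinality in relations:
        by_source.setdefault(source, []).append(cardinality)
    return sorted(
        source
        for source, cards in by_source.items()
        if all(card == "many-to-one" for card in cards)
    )


candidates = transaction_link_candidates(relations)
print(candidates)
```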

All of these improvements can be added to the Python script.

How to support us?

If you like our work and reuse it freely in your own projects, please leave a testimonial on our LinkedIn company page.
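For instance, supporting another platform mostly comes down to the type mapping table. The sketch below shows one possible shape for such a table; the logical type names, the `TYPE_MAPPINGS` structure and the `physical_type` function are assumptions for illustration and may differ from the table actually used in the script.

```python
# Hypothetical mapping from logical model types to physical column types.
TYPE_MAPPINGS = {
    "snowflake": {
        "string": "VARCHAR",
        "integer": "NUMBER(38,0)",
        "float": "FLOAT",
        "boolean": "BOOLEAN",
        "date": "DATE",
        "datetime": "TIMESTAMP_NTZ",
    },
    "postgres": {
        "string": "TEXT",
        "integer": "BIGINT",
        "float": "DOUBLE PRECISION",
        "boolean": "BOOLEAN",
        "date": "DATE",
        "datetime": "TIMESTAMP",
    },
}


def physical_type(logical_type, platform="snowflake"):
    """Look up the physical column type for a logical type on a platform."""
    try:
        return TYPE_MAPPINGS[platform][logical_type.lower()]
    except KeyError:
        raise ValueError(
            "No mapping for type %r on platform %r" % (logical_type, platform)
        )


print(physical_type("String", "postgres"))
print(physical_type("datetime"))
```

Failing loudly on an unknown type avoids silently generating DDL with a wrong or missing column type.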
