The key to building Data Vaults fast: patterns and repeatability

A repeatable pattern is automatable and automation accelerates the process.

Given the right circumstances, data vault can be a great approach to data warehousing (check out Micron and Vodafone Netherlands). Of course, as with everything, there is no free lunch. By design, a data vault has more tables and more relationships than other models, which means it takes time and effort to build.

If that sounds challenging, you’re right, it is. So here’s some good news: once you get started, you quickly realise there is a pattern to the model. If there’s a pattern, it is repeatable and therefore automatable.

Repeatability is good. It means once you’ve got your head around it, the process soon accelerates.

Think of the data vault as a multi-floor building where each floor is the same as the one under it. The hard(est) work goes in on the foundations, with the initial creation of objects, tables and relationships.

Each subsequent level of the ‘building’ repeats the pattern within the foundations, making it considerably easier and faster as the layers are added. There may be 3 or 4 times as many tables, but there is a pattern to it. They all have the same hash key, date and record source metadata columns. These characteristics are built the same no matter the table type.
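To make that pattern concrete, here is a minimal sketch in Python of how a single template could stamp out hub, link and satellite tables that all share the same hash key, load date and record source columns. The column names, SQL types, table names and the MD5-based hash_key helper are illustrative assumptions, not taken from any particular tool or standard naming scheme.

```python
import hashlib

# Metadata columns shared by every table type.
# Names and SQL types are assumptions chosen for illustration.
COMMON_COLUMNS = [
    ("hash_key", "CHAR(32)"),
    ("load_date", "TIMESTAMP"),
    ("record_source", "VARCHAR(100)"),
]


def build_ddl(table_name, extra_columns):
    """Return a CREATE TABLE statement: common columns first, then table-specific ones."""
    columns = COMMON_COLUMNS + list(extra_columns)
    body = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE {table_name} (\n{body}\n);"


def hash_key(*business_keys):
    """Derive a deterministic key from one or more business keys.

    Trimming, upper-casing and joining with '||' before hashing is one
    common convention; treat it as an assumption and adapt as needed.
    """
    normalised = "||".join(str(key).strip().upper() for key in business_keys)
    return hashlib.md5(normalised.encode("utf-8")).hexdigest()


# The same template produces all three table types (hypothetical tables).
hub_ddl = build_ddl("hub_customer", [("customer_id", "VARCHAR(50)")])
link_ddl = build_ddl(
    "link_customer_order",
    [("hub_customer_hash_key", "CHAR(32)"), ("hub_order_hash_key", "CHAR(32)")],
)
sat_ddl = build_ddl(
    "sat_customer_details",
    [("name", "VARCHAR(200)"), ("email", "VARCHAR(200)")],
)

if __name__ == "__main__":
    for ddl in (hub_ddl, link_ddl, sat_ddl):
        print(ddl, end="\n\n")
```

Only the table-specific columns change from one call to the next; everything else is generated, which is why each additional table costs less effort than the first.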

You’re effectively investing the most time up-front and you can expect to make the gains associated with data vault further down the track.


The DIY Trap


Which brings us to the DIY trap. Recognising the pattern-based approach leads smart devs to create tools to automate parts of the development. This usually starts small, but to be worthwhile the tools must do more and more.

What’s bad about that? Well, all the time you spend building the tool is time you’re not spending on building the data vault, the thing that will deliver the value. Then there’s the question of who supports your new tool when something goes wrong (it’s you).

The advice here is simple: when starting a data vault implementation, tools like WhereScape’s Data Vault Express can step in, automate a lot of the process and save you time and money. That’s because RED and Data Vault Express are built with rules-based development and frameworks in mind, to accelerate development and increase consistency and supportability. What’s more, if you’re new to data vaulting, those built-in frameworks will guide you and your team, compressing the learning curve and reducing the likelihood of making mistakes or taking a wrong turn.


More on Data Vault Express


Data Vault Express is a combination of WhereScape 3D and WhereScape RED, used in unison to automate the build of a Data Vault 2.0 compliant data vault (and it’s even endorsed by Dan Linstedt, the guy who invented Data Vault).

WhereScape 3D discovers and profiles your data source, and its model conversion rules flexibly model the data vault to Data Vault 2.0 best practice, including the underlying load and staging tables required for your data vault model.

You then package and deploy that model from WhereScape 3D into WhereScape RED, where the data vault-specific objects can be managed and scheduled for processing at the required interval.

Sound easy? Well, certainly easier. Data Vault Express will accelerate development, reduce complexity and increase the manageability of your data vault.
