🧠 Second Brain

Data Modeling

Last updated Sep 13, 2023

Data Modeling is as much about Data Engineering Architecture as it is about modeling the data itself. So besides the links below, you can find many approaches and common architectures in Data Engineering Architecture.

It's becoming more about language than actual modeling, as Shane Gibson says on Making Data Modeling Accessible. For example, a Data Scientist speaks in Wide Tables, while a Data Engineer talks about facts and dimensions. This is what I call the different levels of data modeling in Data Modeling – The Unsung Hero of Data Engineering: An Introduction to Data Modeling (Part 1).
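
To make the two "languages" concrete, here is a minimal sketch in plain Python of the same sales data seen both ways: as a fact table plus a dimension (the Data Engineer's view) and as one denormalized wide table (the Data Scientist's view). The table names, column names, and sample values are all hypothetical and only serve as an illustration.

```python
# Hypothetical sample data; every table and column name here is illustrative.

# Dimension: descriptive attributes, one entry per product key.
dim_product = {
    1: {"product_name": "Keyboard", "category": "Hardware"},
    2: {"product_name": "E-Book", "category": "Digital"},
}

# Fact: measurable events that reference the dimension by key.
fact_sales = [
    {"product_id": 1, "order_date": "2023-09-01", "amount": 49.0},
    {"product_id": 2, "order_date": "2023-09-01", "amount": 9.5},
    {"product_id": 1, "order_date": "2023-09-02", "amount": 49.0},
]


def to_wide_table(facts, dimension, key):
    """Denormalize: copy the dimension attributes onto every fact row."""
    wide = []
    for row in facts:
        attributes = dimension[row[key]]
        wide.append({**row, **attributes})
    return wide


# Wide table: one flat row per sale, no join needed by the consumer.
wide_sales = to_wide_table(fact_sales, dim_product, "product_id")
```

Both shapes carry the same information. The wide table repeats the product attributes on every row, which is convenient for analysis but duplicates data that the dimensional version stores once.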


A nice illustration of how different modeling techniques work | Source: GitHub - Data-Engineer-Camp/dbt-dimensional-modelling: Step-by-step tutorial on building a Kimball dimensional model with dbt

# Different Levels

How do you think about the different levels of modeling? When I started (20 years ago), it was common to choose between Inmon and Kimball. But today there are so many layers, levels, and approaches. Have you found a good way of separating or naming the different "levels" (I'm still not sure "levels" is the right word) to make clear what is meant? Below I collected a list of what I think so far (I have also written about this extensively, in case you are interested).

LinkedIn Post and Discussion and dbt Slack. Links (from post): Data Model Matrix.

# (Design) Patterns

Common approaches are well explained here:

others

# Data Modeling is changing

# Tools

# Frameworks

# Difference to Dimensional Modeling

There is more than dimensional modeling:

# Data Modeling part of Data Engineering?

Data modeling, especially Dimensional Modeling with its facts and dimensions, is a big thing for a data engineer, IMO. You need to ask vital questions to optimize for data consumers. Do you want to drill down into the different products? Is daily or monthly data enough? The keywords here are granularity and rollup.
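
As a rough illustration of granularity and rollup, the sketch below takes facts stored at a daily grain and rolls them up to a monthly grain per product. The data and the names `daily_sales` and `rollup_monthly` are made up for this example; in practice this would usually be a `GROUP BY` in the warehouse.

```python
from collections import defaultdict

# Hypothetical facts at daily granularity: one row per product per day.
daily_sales = [
    {"order_date": "2023-08-30", "product": "Keyboard", "amount": 49.0},
    {"order_date": "2023-08-31", "product": "Keyboard", "amount": 49.0},
    {"order_date": "2023-09-01", "product": "Keyboard", "amount": 49.0},
    {"order_date": "2023-09-01", "product": "E-Book", "amount": 9.5},
]


def rollup_monthly(rows):
    """Roll daily rows up to one total per (month, product)."""
    totals = defaultdict(float)
    for row in rows:
        month = row["order_date"][:7]  # "YYYY-MM" prefix of an ISO date
        totals[(month, row["product"])] += row["amount"]
    return dict(totals)


# Coarser grain: fewer rows, but the daily detail is gone.
monthly_sales = rollup_monthly(daily_sales)
```

The rollup only works in one direction: monthly totals can always be derived from daily rows, but not the other way around. That is why the grain is worth deciding with the data consumers up front.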

It also makes you think about the Big-O implications of how often you touch and transfer data. I'd recommend the old Data Warehouse Toolkit by Ralph Kimball, which introduced many of these concepts and is still applicable today. Data modeling is mostly skipped in the beginning, but as soon as you get bigger, you wish you had done more :)

Created 2022-09-24