Data Modeling is changing
Especially with newer Data Engineering approaches and tools, the landscape has changed drastically (see The 2023 MAD (Machine Learning, Artificial Intelligence & Data) Landscape).
Essentially, you can’t change ETL without modeling differently. Here are a few things that have changed and will continue to change:
- Denormalizing further for performance gains is mostly unnecessary, as faster database engines and cloud solutions compensate for the difference.
- Maintaining surrogate keys in dimensions can be tricky and is not human-friendly; business keys are usually preferred.
- With the popularity of document stores and cheap blob storage in the cloud, database schemas can be created and evolved dynamically without writing DDL statements.
- Systematically snapshotting dimensions, rather than handling complex and sometimes counterintuitive Slowly Changing Dimensions (Type 2), is a simpler way to track changes in a DWH. It is also easy and relatively cheap to denormalize dimension attributes directly onto the fact table to preserve important information at the moment of the transaction.
- Conformed dimensions, and conformance in general, remain extremely important in today’s data warehouses and data environments. But loosening them up is a necessary trade-off for working more collaboratively on the same objects.
- Not only are more people working on the same data warehouse project, but people from business and other departments are also more data-savvy than ever before. As a result, data needs to move toward real time rather than batch processing with precomputed calculations; with fast technologies such as Spark, complex jobs can be run ad hoc and on demand.
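To make the schema-on-read point concrete, here is a minimal sketch in plain Python. A JSON-lines string stands in for a cheap cloud blob, and all names (`order_id`, `channel`, `read_orders`) are illustrative: the idea is that a new attribute can appear in later documents without any `ALTER TABLE`, and defaults are applied at read time instead.

```python
import json

# Documents with evolving shapes land as-is; no DDL migration is needed
# when a new attribute (here "channel") appears only in later events.
events = [
    {"order_id": 1, "amount": 50},
    {"order_id": 2, "amount": 75, "channel": "web"},  # new field, no ALTER TABLE
]

blob = "\n".join(json.dumps(e) for e in events)  # stand-in for a cloud blob

def read_orders(raw):
    """Schema-on-read: apply defaults for fields that older documents lack."""
    for line in raw.splitlines():
        doc = json.loads(line)
        yield {
            "order_id": doc["order_id"],
            "amount": doc["amount"],
            "channel": doc.get("channel", "unknown"),
        }

orders = list(read_orders(blob))
print(orders[0]["channel"])  # prints: unknown
```

The schema lives in the reader, not the storage layer, which is what makes dynamic evolution cheap.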
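The snapshotting and fact-table denormalization ideas above can be sketched in a few lines of plain Python. This is only an illustration with made-up names (`dim_customer`, `record_sale`, `segment_at_sale`), not a production pattern: instead of maintaining SCD Type 2 effective-date ranges, we append a dated full copy of the dimension, and we freeze the dimension attribute onto each fact row at transaction time.

```python
import datetime

# Illustrative dimension rows keyed by business key (customer_id).
dim_customer = {
    "C1": {"customer_id": "C1", "segment": "retail"},
    "C2": {"customer_id": "C2", "segment": "wholesale"},
}

snapshots = []  # one full copy of the dimension per load date

def snapshot_dimension(dim, as_of):
    """Append a dated full copy of the dimension instead of
    maintaining SCD Type 2 effective-date ranges."""
    for row in dim.values():
        snapshots.append({**row, "snapshot_date": as_of})

def record_sale(fact_table, dim, customer_id, amount, sale_date):
    """Denormalize the dimension attribute onto the fact row,
    freezing its value at the moment of the transaction."""
    fact_table.append({
        "customer_id": customer_id,
        "amount": amount,
        "sale_date": sale_date,
        "segment_at_sale": dim[customer_id]["segment"],
    })

fact_sales = []
snapshot_dimension(dim_customer, datetime.date(2023, 1, 1))
record_sale(fact_sales, dim_customer, "C1", 100.0, datetime.date(2023, 1, 2))

# The customer later changes segment; history survives in both places.
dim_customer["C1"]["segment"] = "wholesale"
snapshot_dimension(dim_customer, datetime.date(2023, 2, 1))

print(fact_sales[0]["segment_at_sale"])  # prints: retail
```

History is answered by filtering snapshots on `snapshot_date`, or by reading the frozen attribute straight off the fact row, with no effective-from/effective-to bookkeeping.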
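The shift from precomputed batch aggregates to on-demand computation can be shown with a toy example. The text mentions Spark; this plain-Python sketch (all names such as `sales_rows` and `revenue_by_region` are made up) only illustrates the pattern of computing an aggregate when it is requested instead of maintaining a summary table that must be kept in sync.

```python
from collections import defaultdict

# Illustrative fact rows; in practice these would live in the warehouse.
sales_rows = [
    {"region": "EU", "amount": 100.0},
    {"region": "EU", "amount": 50.0},
    {"region": "US", "amount": 75.0},
]

def revenue_by_region(rows):
    """Compute the aggregate on demand rather than maintaining a
    precomputed summary table that can drift out of sync."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)

print(revenue_by_region(sales_rows))  # prints: {'EU': 150.0, 'US': 75.0}
```

With an engine like Spark doing the heavy lifting, the same on-demand query stays fast at scale, which is what makes precomputation less necessary.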
See more in Babies and bathwater: Is Kimball still relevant?