🧠 Second Brain

Snapshotting

Last updated Feb 9, 2024

Unlike SCD2 (Slowly Changing Dimension Type 2), which adds a new versioned row each time a record changes, snapshotting captures the full state of the data at fixed intervals, much like a Materialized View.

Similar capabilities can also be achieved with Time Travel in Data Lake Table Formats.

In his insightful work, Maxime Beauchemin advocates for the use of partitions in snapshotting. He suggests maintaining two separate tables for each dimension: dimension and dimension_history. He emphasizes:

“The most recent partition is especially valuable as it reflects the current state. Employing table partitioning strategies and creating a view that points directly to the latest partition ensures easy and optimal data access. Effective naming conventions are crucial here, as exemplified by core.user_history and core.user.”
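The two-table pattern described above can be sketched roughly as follows. This is a minimal illustration using Python's built-in sqlite3 module, not Beauchemin's implementation. SQLite has no native partitions, so a ds (snapshot date) column stands in for the partition key, and the names user_history and user drop the core. schema prefix. The columns user_id and plan, and the helper load_snapshot, are hypothetical.

```python
import sqlite3

# In-memory database for illustration only.
conn = sqlite3.connect(":memory:")

# History table: one full copy of the dimension per snapshot date (ds).
# The ds column plays the role of the partition key (an assumption of this sketch).
conn.execute(
    """
    CREATE TABLE user_history (
        ds      TEXT    NOT NULL,  -- snapshot date, ISO format so MAX() sorts correctly
        user_id INTEGER NOT NULL,  -- hypothetical column
        plan    TEXT    NOT NULL,  -- hypothetical column
        PRIMARY KEY (ds, user_id)
    )
    """
)

# Current-state view: always points at the most recent snapshot.
conn.execute(
    """
    CREATE VIEW "user" AS
    SELECT user_id, plan
    FROM user_history
    WHERE ds = (SELECT MAX(ds) FROM user_history)
    """
)


def load_snapshot(conn, ds, rows):
    """Replace the snapshot for one date, so reruns do not duplicate rows."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM user_history WHERE ds = ?", (ds,))
        conn.executemany(
            "INSERT INTO user_history (ds, user_id, plan) VALUES (?, ?, ?)",
            [(ds, user_id, plan) for user_id, plan in rows],
        )


# Two daily snapshots: user 1 upgrades and user 3 signs up on the second day.
load_snapshot(conn, "2024-02-08", [(1, "free"), (2, "pro")])
load_snapshot(conn, "2024-02-09", [(1, "pro"), (2, "pro"), (3, "free")])

# Current state comes from the view; past state comes from filtering on ds.
current = conn.execute(
    'SELECT user_id, plan FROM "user" ORDER BY user_id'
).fetchall()
as_of = conn.execute(
    "SELECT plan FROM user_history WHERE ds = ? AND user_id = ?",
    ("2024-02-08", 1),
).fetchone()
```

Here current holds the three rows from the latest snapshot, while as_of returns the plan user 1 had on the earlier date. Because load_snapshot deletes a date's rows before inserting, reloading the same date overwrites that snapshot instead of appending to it.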


Created 2023-04-17