🧠 Second Brain

Search

Search IconIcon to open search

Granularity

Last updated Aug 9, 2024

In Dimensional Modeling, defining the grain is a pivotal step. This process involves determining what each row in a fact table signifies, laying down a foundational framework for the entire design.

Before diving into selecting dimensions or facts, it’s critical to establish the grain. The reason? Every dimension or fact must be in harmony with the grain’s definition. Such alignment is key to achieving consistency in all dimensional designs, enhancing both the performance and usability of BI Applications.

When we talk about atomic grain, we refer to the most detailed level at which a business process records data. Initiating your design with data at the atomic grain is wise, as it equips you to handle a wide range of unpredictable user queries effectively.

Summary grains, which aggregate data at a higher level (often called a rollup), do optimize performance. However, they’re built on presumptions about frequent business queries. Remember, each proposed grain for a fact table necessitates a distinct physical table. Avoid mixing different grains within the same fact table to maintain clarity and efficiency.

# Normalization

In comparison to fact tables and dimensions, we often encounter a denormalized table. This table encompasses all dimensions needed for a specific use case but with many repeated entries. For instance, dimensions like geography or customer are duplicated, unlike in a traditional relational database approach (e.g., Third Normal Form) where dimensions are linked via Foreign Keys.

Take the example of a One Big Table: it may feature hundreds of columns. This contrasts with a relational setup, where you’d normalize data into separate tables. Thanks to its columnar-oriented architecture, adding new columns in a One Big Table is straightforward. This approach is also prevalent in Modern OLAP, like normalizing nested formats (structs, Arrays, JSON) within a table to achieve finer granularity.

Explore further in Normalization vs Denormalization.

# Granularity and Cardinality

In the realm of data engineering, the concept of Granularity is intrinsically linked to cardinality, particularly when discussing database design and business intelligence. Understanding this relationship is crucial for effective data management and analysis.

Discover more in One Big Table in the DEDP Book.


Origin: Grain | Kimball Dimensional Modeling Techniques
References:
Created 2022-09-09