Cardinality
In data modeling, cardinality describes how unique the values in a given column of a database table are.
- High cardinality means a column contains many unique values, such as user IDs.
- Low cardinality means the values in a column mostly repeat, such as a column containing only “Yes” and “No” answers.
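As a rough illustration of the two cases above, cardinality can be measured by counting distinct values. This is a minimal sketch that assumes a column is just a Python list; the `cardinality` helper and the sample values are hypothetical and not taken from any particular database.

```python
def cardinality(column):
    """Return the number of distinct values in a column."""
    return len(set(column))


# Hypothetical sample columns, for illustration only
user_ids = [101, 102, 103, 104, 105, 106]
answers = ["Yes", "No", "Yes", "Yes", "No", "Yes"]

user_id_cardinality = cardinality(user_ids)  # every value is unique
answer_cardinality = cardinality(answers)    # only two distinct values

# Ratio of distinct values to rows: close to 1.0 is high cardinality,
# close to 0 is low cardinality
user_id_ratio = user_id_cardinality / len(user_ids)
answer_ratio = answer_cardinality / len(answers)

print(user_id_cardinality, answer_cardinality)
```

In SQL, the same count is usually obtained with `COUNT(DISTINCT column)`.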
# Example
Cardinality determines how many combinations of values a data model has to handle.
Suppose we are designing dimensions for a dimensional model. Each dimension has its own columns, and each column has its own cardinality, that is, its count of distinct values. Take a model with three dimensions: time with 12 values (months), location with 10 values (stores), and product category with 5 values.
The total number of possible combinations is the product of the dimensions' cardinalities: 12 (time) × 10 (location) × 5 (product category) = 600 possible combinations. Complexity grows quickly as more columns and dimensions are added. To illustrate: with time and location alone there are 12 × 10 = 120 combinations. Adding the product category contributes 480 additional combinations, for a total of 600.
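The arithmetic above can be checked with a short sketch. The dimension member names (`month_1`, `store_1`, `category_1`, and so on) are placeholders invented for illustration; only the counts 12, 10, and 5 come from the example.

```python
from itertools import product
from math import prod

# Placeholder members for each dimension; only the counts matter here
dimensions = {
    "time": [f"month_{m}" for m in range(1, 13)],                # 12 values
    "location": [f"store_{s}" for s in range(1, 11)],            # 10 values
    "product_category": [f"category_{c}" for c in range(1, 6)],  # 5 values
}

# Cardinality of each dimension = number of distinct members
cardinalities = {name: len(set(values)) for name, values in dimensions.items()}

# Total combinations = product of the cardinalities
total = prod(cardinalities.values())

# Combinations before and after adding the product category
two_dims = cardinalities["time"] * cardinalities["location"]
added = total - two_dims

# Enumerating every cell a fully pre-computed cube would have to hold
cells = list(product(*dimensions.values()))

print(cardinalities, two_dims, added, total, len(cells))
```

Because the total is a product, each new dimension multiplies the number of cells rather than adding to it, which is why the count climbs so quickly.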
This is the limitation of a traditional OLAP cube: pre-calculating every combination can take impractically long, exceeding a reasonable processing window (e.g., overnight).
Many-to-Many Relationships add further complexity. A measure may relate to a dimension through Bridge Tables, which increases the number of potential combinations. How fast this grows depends on the specific design and on the cardinality of the cube’s dimensions and measures.
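To make the bridge-table effect concrete, here is a small sketch under assumed data: a hypothetical `sales` fact table and a hypothetical `bridge` table linking each sale to one or more promotions. None of these names come from the note; they only illustrate how a many-to-many link multiplies rows.

```python
# Hypothetical fact rows: (sale_id, amount)
sales = [(1, 100.0), (2, 50.0)]

# Hypothetical bridge table: (sale_id, promotion_id)
# Sale 1 is linked to two promotions, sale 2 to one
bridge = [(1, "promo_a"), (1, "promo_b"), (2, "promo_a")]

# Joining the fact table to the dimension through the bridge
joined = [
    (sale_id, promotion_id, amount)
    for sale_id, amount in sales
    for bridge_sale_id, promotion_id in bridge
    if bridge_sale_id == sale_id
]

true_total = sum(amount for _, amount in sales)
naive_total = sum(amount for _, _, amount in joined)

print(len(sales), len(joined), true_total, naive_total)
```

Two fact rows become three joined rows, and a naive sum over the joined rows counts sale 1 twice. This is one reason a cube with many-to-many links has more combinations to account for than the plain product of its dimensions suggests.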
# Difference to Granularity
Granularity, in contrast, refers to the level of detail in a dataset, particularly in a Fact Table within a Dimensional Model (for example, in a data warehouse). It defines what a single record in the dataset or table represents.
The primary distinction between cardinality and granularity lies in their focal points:
- Cardinality focuses on the values within a single column: how unique or repetitive they are.
- Granularity concerns the table or dataset as a whole: the level of detail or summarization of its records.
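The distinction can be sketched by storing the same facts at two grains. The rows below are made-up sample data, and the daily and monthly grains are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows at daily grain: (date, store, amount)
daily_sales = [
    ("2023-01-05", "store_1", 10.0),
    ("2023-01-20", "store_1", 15.0),
    ("2023-01-07", "store_2", 20.0),
    ("2023-02-03", "store_1", 5.0),
    ("2023-02-11", "store_2", 30.0),
]

# Coarser grain: one row per (month, store)
monthly_totals = defaultdict(float)
for date, store, amount in daily_sales:
    month = date[:7]  # "YYYY-MM"
    monthly_totals[(month, store)] += amount

monthly_sales = [
    (month, store, amount)
    for (month, store), amount in sorted(monthly_totals.items())
]

# Granularity changed: each row now means "one store in one month"
# Cardinality of the store column is a separate property and stays the same
store_cardinality_daily = len({store for _, store, _ in daily_sales})
store_cardinality_monthly = len({store for _, store, _ in monthly_sales})

print(len(daily_sales), len(monthly_sales))
print(store_cardinality_daily, store_cardinality_monthly)
```

Changing the grain changes what a row means and how many rows there are, while the cardinality of the store column is unaffected.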
See also: Big-O and Granularity.
References: Granularity
Created 2023-12-19