Semantic Layer
A Semantic Layer acts as an intermediary, translating complex data into understandable business concepts for users. It bridges the gap between raw data in databases (like sales data with various attributes) and actionable insights (like revenue per store or popular brands). This layer helps business users access and interpret data using familiar terms, without needing deep technical knowledge.
The Semantic Layer serves as a translator between various data presentation layers (Business Intelligence, Notebooks, data apps) and data sources. It integrates data sources, models metrics, and connects with data consumers, translating metrics into languages like SQL, REST, or GraphQL.
A Semantic Layer defines key business metrics (like “active” users or “paying” customers) once company-wide, eliminating inconsistencies across different tools. This centralization and standardization of definitions ensure uniform understanding and reporting across the organization.
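As a rough illustration of defining a metric once and translating it into SQL, here is a minimal Python sketch. The `Metric` class, the `METRICS` registry, the `to_sql` helper, and all table and column names are hypothetical and not taken from any specific tool; real semantic layers such as Cube or dbt use their own definition formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A business metric defined once: name, aggregate expression, source table."""
    name: str
    expression: str   # SQL aggregate, e.g. "COUNT(DISTINCT user_id)"
    table: str        # physical table the metric maps to
    filter: str = ""  # optional business rule, e.g. what "active" means


# Hypothetical company-wide definitions; table and column names are made up.
METRICS = {
    "active_users": Metric(
        name="active_users",
        expression="COUNT(DISTINCT user_id)",
        table="events",
        filter="event_date >= DATE('now', '-30 day')",
    ),
    "revenue": Metric(name="revenue", expression="SUM(amount)", table="orders"),
}


def to_sql(metric_name: str, dimensions: list[str]) -> str:
    """Translate a metric request into a SQL string for the given dimensions."""
    m = METRICS[metric_name]
    select = ", ".join([*dimensions, f"{m.expression} AS {m.name}"])
    sql = f"SELECT {select} FROM {m.table}"
    if m.filter:
        sql += f" WHERE {m.filter}"
    if dimensions:
        sql += " GROUP BY " + ", ".join(dimensions)
    return sql


# Every consumer (BI tool, notebook, data app) asks for "revenue" by name
# and gets the same definition, whatever dimensions it slices by.
print(to_sql("revenue", ["store"]))
```

Because the definition of "active" lives in one place, changing the filter in `METRICS` changes it for every tool that requests `active_users`, which is the consistency argument made above.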
The Metrics Layer, a subset of the Semantic Layer, was first introduced by SAP BO in 1991, and the Kimball Group discussed the semantic layer concept in 2013. The Rise of the Semantic Layer provides more on this.
Headless BI vs. Semantic Layer
Headless BI, often used interchangeably with a Semantic Layer, can be considered a practical implementation of the latter. The term’s origins are traced to a LinkedIn Comment.
Insight from Maxime Beauchemin (Podcast)
The Semantic Layer is like a restaurant menu: you know what you’re ordering, but not how it’s made. This layer maps metrics to physical tables and can range from minimal modifications (thin layer) to encompassing transformation logic atop physical tables (thick layer).
# Why a Semantic Layer
In my previous company, we developed an Analytics API, similar to what Cube does, but with orchestration as a key component. We switched from SSAS to Druid, a modern OLAP solution, to handle diverse business metrics queries from Tableau, Notebooks, and our Web App. Because Druid cannot store queries the way SSAS does, we created this Semantic Layer or Analytics API. The Semantic Layer’s real advantage lies in its ability to define and automate metrics as code, offering a universal approach to handling business metrics that is open source or based on an open standard.
More of my thoughts are elaborated in Building an Analytics API with GraphQL and in The Rise of the Semantic Layer: Metrics On-The-Fly. More Slack Convo.
# Different Types of Semantic Layers
From What Is a Semantic Layer? | GoodData:
- Semantic layer in a data warehouse: The main purpose of a data warehouse is to provide a centralized data source for the whole organization. It is designed to be a single source of truth for different departments, user groups, and use cases. The structure of data in the warehouse can be complex and technical, which makes it difficult for users to access the information they need. As a result, business users often extract portions of this data into BI tools, creating localized semantic layers that can contribute to semantic layer spread.
- Semantic layer within data pipelines: When constructing data pipelines (the process of adding data from various sources to a data warehouse), data engineers input a semantic layer in the code. This layer helps to name and organize the different parts of the data models, such as tables and attributes.
- Semantic layer in Business Intelligence (BI) and data analytics: This type of semantic layer defines business concepts and the relationships between them. It also defines metrics and calculations that can be used for analysis and reporting through different users and user groups for specific business use cases.
- Universal semantic layer: There is a connection between raw data and the different tools for users to analyze their data (such as BI and AI/ML tools, management tools, and business applications). A universal semantic layer doesn’t focus on a specific business use case and needs to cover company-wide requirements.
# Semantic Layer Tools
- Cube
- dbt Semantic Layer (MetricFlow: now part of dbt)
- MetriQL
- GoodData
- AtScale with their Universal Semantic Layer
# Related
# Similar to MVC
Could the Semantic Layer be likened to the MVC (Model View Controller) model? In Cube (OLAP), views are created similar to DB views, with the model and component handling the rest. This is akin to Convergent Evolution between Semantic Layer and MVC.
# Knowledge Graphs, LLMs with Semantic Layer
- Semantic Layer as the Data Interface for LLMs
- Natural Language for SL
# Why not define Measures within SQL?
This is what Julian Hyde brought up in his talk Extending SQL for analytics, similar to what MDX Studio did to SSAS.
# History
The Evolution of the Semantic Layer (and related for context MDM, dbt/Jinja):
- 1991: SAP BusinessObjects Universe and BI semantic layer
- 1997: SSAS and MDX, whose logical modeling layer defines business metrics and dimensions in a structured way
- 2008: Master Data Management (MDM) (with MDS from Microsoft in 2008)
- Business entities: MDM focuses on core business entities (customers, products, locations), while semantic layers typically focus on metrics and dimensions.
- Single Source of Truth: MDM provides it for master data records, the Semantic Layer (SL) for metrics
- Data Governance: Both approaches involve managing and governing data definitions
- 2013: Kimball discussed the concept of a semantic layer in #158 Making Sense of the Semantic Layer
- 2016: Maturing BI tools such as Tableau, TARGIT, Power BI, and Apache Superset ship an integrated semantic layer, each with its own metrics layer definition
- 2018: Jinja templates and dbt eroding the transformation layer into a semantic layer
- Not by definition, but dbt’s declarative SQL definitions, which describe your whole DWH, are in a way an early semantic layer. They create a single source of truth (although potentially many different ones 😅).
- If you think about what the old SAP BO Universe was, it was a logical model of SQL definitions. In a way, dbt definitions are the same. You do not have the visual designer, unless you run dbt docs. That’s at least my thought on how it relates to the history overall.
- I think it is a fine line. You can define measures and dimensions in dbt as SQL and add stuff with Jinja, but it may be a stretch to call that semantics. BUT, it is declarative with SQL :) Full Discussion
- 2019: Looker and LookML became popular as the first real semantic layer
- 2022: Modern Semantic Layer, Metric Layer, or Headless BI tools such as MetriQL, MetricFlow, Minerva, and dbt arose with the explosion of data tools (BI tools, notebooks, spreadsheets, machine learning models, data apps, reverse ETL, …)
More on The Rise of the Semantic Layer | ssp.sh
# Other Resources
# Pedram Navid
From Pedram Navid on LinkedIn (#dbt #metrics, 38 comments): While dbt is building a metrics layer, the question still remains whether a metrics layer outside of BI will ever gain wide enough adoption. Jacob Matson rightly points out that Looker, Thoughtspot, Power BI, and Transform all have a metrics layer tightly integrated within BI, and they are good enough.
The challenges dbt has are that its implementation is pretty bad (no one wants to write Jinja templates in YAML), it lacks features critical for it to be useful (like joins), and it’s not clear that we need widespread access to metrics across tools outside of BI.
# Michael Driscoll
He replies that the metrics layer may not actually need to be a layer: it could get baked into the SQL standard (see Extending SQL for analytics).
Databases could implement it. And every BI tool could query metrics directly.
Metrics layers are just aggregate expressions with some metadata.
Let them live in SQL.
If DuckDB Labs can introduce ‘GROUP BY ALL’ and proclaim that ‘FROM foo;’ is valid SQL, surely they could bring us aggregate awareness too.
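One way to read "aggregate expressions with some metadata" is sketched below in Python with the standard-library `sqlite3` module. The `MEASURES` dictionary, the `query_measure` helper, and the sample `orders` rows are assumptions made up for illustration; this is not how any database implements measures today. The point it tries to show is that a measure stored as an expression is evaluated at whatever grain the query asks for, rather than being pre-aggregated.

```python
import sqlite3

# A measure is an aggregate expression plus a little metadata.
# Names and sample rows are illustrative assumptions.
MEASURES = {
    "avg_order_value": {
        "sql": "SUM(amount) * 1.0 / COUNT(*)",
        "description": "Revenue divided by number of orders",
    },
}


def query_measure(conn, measure, table, group_by=None):
    """Evaluate a measure at whatever grain the caller asks for."""
    expr = MEASURES[measure]["sql"]
    if group_by:
        sql = (
            f"SELECT {group_by}, {expr} FROM {table} "
            f"GROUP BY {group_by} ORDER BY {group_by}"
        )
    else:
        sql = f"SELECT {expr} FROM {table}"
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (store TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("A", 10.0), ("A", 30.0), ("B", 50.0)],
)

per_store = query_measure(conn, "avg_order_value", "orders", group_by="store")
overall = query_measure(conn, "avg_order_value", "orders")
print(per_store)  # [('A', 20.0), ('B', 50.0)]
print(overall)    # [(30.0,)]
```

Averaging the two per-store results would give 35, while the measure evaluated over all orders gives 30. Keeping the expression instead of a stored result is what lets the same definition stay correct at every level of aggregation.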
# Artyom Keydunov & Pavel Tiunov - Cube
The Semantic Layer and its relation to the MVC (Model View Controller) pattern, popularized in Ruby on Rails.
The concept of a Semantic Layer shares similarities with the MVC (Model-View-Controller) model, particularly as popularized by Ruby on Rails through its Active Record pattern. In a conversation with Artyom Keydunov & Pavel Tiunov from Cube.dev, Artyom drew parallels between the two:
- Just as the MVC model focuses on decoupling data, the Cube Semantic Layer serves a similar purpose. In Cube, this decoupling is evident in how they create views, akin to database views, while the model and component handle the rest.
- The Semantic Layer can be likened to the Active Record in Ruby on Rails. Active Record is an implementation of the ORM (Object-Relational Mapping) pattern, which abstracts and simplifies database interactions. Similarly, the Semantic Layer abstracts complex data structures, making them more understandable and accessible to business users.
- At its core, the Semantic Layer represents the Logical Data Model in Data Modeling, serving as an intermediary between raw data and its representation to end-users.
This comparison underscores the universality of certain design patterns across different domains and technologies. Whether it’s web development with Ruby on Rails or data engineering with tools like Cube, the principles of abstraction, simplification, and decoupling remain consistent.
# More perspectives
- A semantic layer represents business data in a way that end-users can access using common business terms (see Demystifying the Metrics Store and Semantic Layer).
- The Metrics Layer, synonymous with a Semantic Layer, was previously just random queries in BI tools (see Deep Dive: What The Heck Is the Metrics Layer).
Explore more about the Semantic Layer:
- Down the Semantic Rabbit Hole
- The Missing Piece of the Modern Data Stack
- Deep Dive: What The Heck Is the Metrics Layer
- The Great Data Debate by Atlan
- The Metrics Layer has Growing up to do
- The Universal Semantic Layer, More Important Than Ever
- Demystifying the Metrics Store and Semantic Layer
I wrote a deep dive into The Rise of the Semantic Layer | ssp.sh, in case you want to know more.
Origin: Metrics Layer
References: The Rise of the Semantic Layer
Created 2022-09-29