🧠 Second Brain

Semantic Layer

Last updated Nov 28, 2024

A Semantic Layer acts as an intermediary, translating complex data into understandable business concepts for users. It bridges the gap between raw data in databases (like sales data with various attributes) and actionable insights (like revenue per store or popular brands). This layer helps business users access and interpret data using familiar terms, without needing deep technical knowledge.

The Semantic Layer serves as a translator between various data presentation layers (Business Intelligence, Notebooks, data apps) and data sources. It integrates data sources, models metrics, and connects with data consumers, translating metrics into languages like SQL, REST, or GraphQL.

A Semantic Layer defines key business metrics (like “active” users or “paying” customers) once company-wide, eliminating inconsistencies across different tools. This centralization and standardization of definitions ensure uniform understanding and reporting across the organization.
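As a minimal sketch of the "define once, use everywhere" idea, the snippet below keeps a single metric definition in a registry and compiles it to SQL for any consumer. The `Metric` class, the `orders` table, and the "paying customers" rule (`amount > 0`) are illustrative assumptions, not the API of any particular tool.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One company-wide metric definition (all names here are hypothetical)."""
    name: str
    table: str
    expression: str  # aggregate SQL expression


# Defined once; every consumer (BI tool, notebook, app) reads this registry.
METRICS = {
    "paying_customers": Metric(
        name="paying_customers",
        table="orders",
        expression="COUNT(DISTINCT CASE WHEN amount > 0 THEN customer_id END)",
    ),
}


def compile_metric(metric_name: str, dimensions: tuple[str, ...] = ()) -> str:
    """Translate a metric request into SQL, optionally grouped by dimensions."""
    metric = METRICS[metric_name]
    select = ", ".join([*dimensions, f"{metric.expression} AS {metric.name}"])
    sql = f"SELECT {select} FROM {metric.table}"
    if dimensions:
        cols = ", ".join(dimensions)
        sql += f" GROUP BY {cols} ORDER BY {cols}"
    return sql


def demo_rows() -> tuple[list, list]:
    """Run the same definition at two grains against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, store TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "Zurich", 20.0), (1, "Zurich", 5.0), (2, "Bern", 0.0), (3, "Bern", 12.5)],
    )
    total = conn.execute(compile_metric("paying_customers")).fetchall()
    by_store = conn.execute(compile_metric("paying_customers", ("store",))).fetchall()
    conn.close()
    return total, by_store
```

Because both queries are generated from the same definition, a dashboard and a notebook cannot drift apart on what "paying" means; changing the rule means editing one expression.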

The Metrics Layer, a subset of the Semantic Layer, was first introduced by SAP BusinessObjects (BO) in 1991, and Kimball Group defined it in 2013. The Rise of the Semantic Layer provides more on this.

Headless BI vs. Semantic Layer

Headless BI, often used interchangeably with Semantic Layer, can be considered a practical implementation of the latter. The term’s origin traces back to a LinkedIn comment.

Insight from Maxime Beauchemin (Podcast)

The Semantic Layer is like a restaurant menu: you know what you’re ordering, but not how it’s made. This layer maps metrics to physical tables and can range from minimal modifications (thin layer) to encompassing transformation logic atop physical tables (thick layer).
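The thin versus thick distinction can be sketched as follows. This is an illustration under assumptions: the `sales` and `raw_sales` tables, the cents-to-currency conversion, and the refund filter are all made up for the example.

```python
import sqlite3

# Thin layer: the metric maps almost directly onto an already-modeled table.
THIN_LAYER = {
    "revenue": {"source": "sales", "expression": "SUM(amount)"},
}

# Thick layer: the layer itself carries transformation logic (unit conversion
# and refund filtering) on top of the raw physical table.
THICK_LAYER = {
    "revenue": {
        "source": "(SELECT amount_cents / 100.0 AS amount "
                  "FROM raw_sales WHERE refunded = 0)",
        "expression": "SUM(amount)",
    },
}


def render(layer: dict, metric: str) -> str:
    """Build the SQL a consumer receives when ordering `metric` off the menu."""
    spec = layer[metric]
    return f"SELECT {spec['expression']} AS {metric} FROM {spec['source']} AS src"


def demo() -> tuple[float, float]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_sales (amount_cents INTEGER, refunded INTEGER)")
    conn.executemany(
        "INSERT INTO raw_sales VALUES (?, ?)", [(1000, 0), (550, 0), (300, 1)]
    )
    # An upstream job is assumed to have produced the cleaned `sales` table.
    conn.execute("CREATE TABLE sales (amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?)", [(10.0,), (5.5,)])
    thin = conn.execute(render(THIN_LAYER, "revenue")).fetchone()[0]
    thick = conn.execute(render(THICK_LAYER, "revenue")).fetchone()[0]
    conn.close()
    return thin, thick
```

In both cases the consumer only asks for `revenue`; the difference is whether the cleaning logic lives upstream (thin) or inside the layer (thick).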

# Why a Semantic Layer

In my previous company, we developed an Analytics API, similar to what Cube does, but with orchestration as a key component. We switched from SSAS to Druid, a modern OLAP solution, to handle diverse business metrics queries from Tableau, Notebooks, and our Web App. Because Druid cannot store queries the way SSAS does, we created this Semantic Layer, or Analytics API. The Semantic Layer’s real advantage lies in its ability to define and automate metrics as code, offering a universal, open-source, or open-standard approach to handling business metrics.

More of my thoughts are elaborated in Building an Analytics API with GraphQL and in The Rise of the Semantic Layer: Metrics On-The-Fly. More in a related Slack conversation.

# Different Types of Semantic Layers

From What Is a Semantic Layer? | GoodData:

  • Semantic layer in a data warehouse: The main purpose of a data warehouse is to provide a centralized data source for the whole organization. It is designed to be a single source of truth for different departments, user groups, and use cases. The structure of data in the warehouse can be complex and technical, which makes it difficult for users to access the information they need. As a result, business users often extract portions of this data into BI tools, creating localized semantic layers that can contribute to semantic layer spread.
  • Semantic layer within data pipelines: When constructing data pipelines (the process of adding data from various sources to a data warehouse), data engineers input a semantic layer in the code. This layer helps to name and organize the different parts of the data models, such as tables and attributes.
  • Semantic layer in Business Intelligence (BI) and data analytics: This type of semantic layer defines business concepts and the relationships between them. It also defines metrics and calculations that can be used for analysis and reporting through different users and user groups for specific business use cases.
  • Universal semantic layer: There is a connection between raw data and the different tools for users to analyze their data (such as BI and AI/ML tools, management tools, and business applications). A universal semantic layer doesn’t focus on a specific business use case and needs to cover company-wide requirements.

# Semantic Layer Tools

# Similar to MVC?

Idea
Could the Semantic Layer be likened to the MVC (Model View Controller) pattern? In Cube (OLAP), views are created much like DB views, with the model and controller handling the rest. This is akin to Convergent Evolution between the Semantic Layer and MVC.

# Knowledge Graphs, LLMs with Semantic Layer

# Why not define Measures within SQL?

This is what Julian Hyde brought up in his talk Extending SQL for analytics, similar to what MDX Studio did to SSAS.

# History

The Evolution of the Semantic Layer (with the related MDM and dbt/Jinja included for context):

More on this in The Rise of the Semantic Layer | ssp.sh

# Other Resources

# Pedram Navid

From Pedram Navid's LinkedIn post (#dbt #metrics): While dbt is building a metrics layer, the question still remains whether a metrics layer outside of BI will ever gain wide enough adoption. Jacob Matson rightly points out that Looker, ThoughtSpot, Power BI, and Transform all have a metrics layer tightly integrated within BI, and they are good enough.

The challenges dbt has are that its implementation is pretty bad (no one wants to write Jinja templates in YAML), it lacks features critical for it to be useful (like joins), and it’s not clear that we need widespread access to metrics across tools outside of BI.

# Michael Driscoll

Replies: The metrics layer may not actually need to be a layer; it could get baked into the SQL standard (Extending SQL for analytics).

Databases could implement it. And every BI tool could query metrics directly.

Metrics layers are just aggregate expressions with some metadata.

Let them live in SQL.

If DuckDB Labs can introduce ‘GROUP BY ALL’ and proclaim that ‘FROM foo;’ is valid SQL, surely they could bring us aggregate awareness too.
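To make "aggregate awareness" concrete, here is a rough sketch of the routing decision: given the requested dimensions, pick the coarsest pre-aggregated table that still covers them, and fall back to the base table otherwise. The table names, the `ROLLUPS` mapping, and the `pick_table` helper are hypothetical; this is not how DuckDB or any specific engine implements it.

```python
import sqlite3

BASE_TABLE = "sales"  # finest grain: day, store, brand
# Hypothetical pre-aggregated rollups and the dimensions each one keeps.
ROLLUPS = {
    "sales_by_store": {"store"},
    "sales_by_store_brand": {"store", "brand"},
}


def pick_table(dimensions: set[str]) -> str:
    """Return the coarsest rollup covering the dimensions, else the base table."""
    candidates = [
        (len(dims), name) for name, dims in ROLLUPS.items() if dimensions <= dims
    ]
    return min(candidates)[1] if candidates else BASE_TABLE


def revenue_sql(dimensions: list[str]) -> str:
    """Compile a revenue query against whichever table pick_table selects."""
    table = pick_table(set(dimensions))
    # SUM is additive, so summing a rollup's partial sums equals summing base rows.
    select = ", ".join([*dimensions, "SUM(revenue) AS revenue"])
    sql = f"SELECT {select} FROM {table}"
    if dimensions:
        cols = ", ".join(dimensions)
        sql += f" GROUP BY {cols} ORDER BY {cols}"
    return sql


def demo() -> tuple[list, list]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (day TEXT, store TEXT, brand TEXT, revenue INTEGER)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [
            ("2024-01-01", "Zurich", "A", 10),
            ("2024-01-02", "Zurich", "B", 5),
            ("2024-01-01", "Bern", "A", 7),
        ],
    )
    conn.execute(
        "CREATE TABLE sales_by_store AS "
        "SELECT store, SUM(revenue) AS revenue FROM sales GROUP BY store"
    )
    conn.execute(
        "CREATE TABLE sales_by_store_brand AS "
        "SELECT store, brand, SUM(revenue) AS revenue FROM sales GROUP BY store, brand"
    )
    routed = conn.execute(revenue_sql(["store"])).fetchall()
    direct = conn.execute(
        "SELECT store, SUM(revenue) FROM sales GROUP BY store ORDER BY store"
    ).fetchall()
    conn.close()
    return routed, direct
```

The routing only works this simply because `SUM` is additive; non-additive aggregates such as distinct counts cannot be re-aggregated from a rollup in the same way.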

# Artyom Keydunov & Pavel Tiunov - Cube

The Semantic Layer and its relation to the MVC (Model View Controller) pattern, popularized by Ruby on Rails.

The concept of a Semantic Layer shares similarities with the MVC (Model-View-Controller) model, particularly as popularized by Ruby on Rails through its Active Record pattern. In a conversation with Artyom Keydunov & Pavel Tiunov from Cube.dev, Artyom drew parallels between the two:

This comparison underscores the universality of certain design patterns across different domains and technologies. Whether it’s web development with Ruby on Rails or data engineering with tools like Cube, the principles of abstraction, simplification, and decoupling remain consistent.

# More perspectives

Explore more about the Semantic Layer:

I wrote a deep dive into The Rise of the Semantic Layer | ssp.sh, in case you want to know more.


Origin: Metrics Layer
References: The Rise of the Semantic Layer
Created 2022-09-29