Data Mesh: Decentralizing and Unifying Data Resources

By Simon Späti

In the evolving landscape of data management, Data Meshes emerge as a crucial concept, trying to bridge the gaps between isolated data teams. Their core value lies in fostering a shared understanding and usage of data across diverse teams within an organization.

By effectively interlinking platforms, Data Meshes facilitate seamless data transfer, enhancing the organizational workflow. This approach combats the typical disconnect in data handling, offering a solution that balances decentralized resources with a unifying common infrastructure. It empowers domain experts, granting them ownership and control over their data domains.

For deeper insights, consider exploring the foundational paper on this topic, the succinct explanation in What the Heck is a Data Mesh?! or the visually engaging Data Mesh Architecture. Practical applications and perspectives are well-articulated in Data Mesh in Practice.


A nice illustration of the problem of a central data platform team, shared on LinkedIn by Ole Olesen-Bagneux

# History

The term data mesh was first defined by Zhamak Dehghani in 2019 while she was working as a principal consultant at the technology company Thoughtworks. She then provided greater detail on its principles and logical architecture throughout 2020. The approach was predicted to be a “big contender” for companies in 2022. Data meshes have been implemented by companies such as Zalando, Netflix, Intuit, VistaPrint, and PayPal.

In 2022, Dehghani left Thoughtworks to found Nextdata Technologies to focus on decentralized data. Source

# Fleeting Thoughts

From the Netflix Technology blog:

I agree that naming is confusing, but it’s just unfortunate timing: the development of the “Data Mesh” (DM) platform at Netflix started around the same time Zhamak Dehghani first defined the term in the “Beyond the lake” talk in 2018. As we state, “we define Data Mesh as a general purpose data movement and processing platform for moving data between Netflix systems at scale”, nothing more, nothing less. RW Data Mesh — A Data Movement and Processing Platform @ Netflix by Netflix Technology Blog Netflix TechBlog

From Reddit:

  • “Data Mesh: a Data Warehouse that has surpassed Dunbar’s number.”
  • “Data Lake: a collection of data that has reached its lowest point.”

More on Why data pipeline should not be outside of data product.

# Demystifying Data Mesh

  • Is Data Mesh essential for everyone? Probably not. For a critical perspective, see Behind the Hype: Why Data Mesh Is Not Right For You - YouTube.

    • Tweet
      • Another one by Max: “I’m with you. It feels like a bit of everything trendy served in a cocktail of buzzwords. I read Kimball and Inmon 2 decades ago, been a practitioner at many very modern data forward companies (FB, Airbnb, Lyft), and I feel very disoriented reading about data mesh. I don’t think it’s because I don’t get it, but because putting a label on the distributed chaos going on in these organizations doesn’t make it an intelligible concept.” Tweet
  • Alternatives to Data Mesh include Software-Defined Asset by Dagster.
  • For a comprehensive overview, refer to Data Mesh Architecture and various other perspectives listed below.
  • The concept has its skeptics. For instance, a comment on Twitter expresses confusion and doubt about labeling the chaotic, distributed data environments in modern organizations.
  • The most practical insights into Data Mesh might be found in Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes Beyond the Data Lake - YouTube, though it may not fully capture the claimed paradigm shift.
    • “It falls very short of the paradigm shift data mesh claims to be. Infra described there seemed median-ish on the data maturity curve at best compared to 100+ companies I connected with in the context of my work on Airflow & Superset.”
  • Matthew Darwin’s insights on the Firebolt approach to architectures in Firebolt and Data Mesh | Firebolt are also noteworthy.

# Demystify in 2025

Great insights from Reddit on The Data Mesh Hangover Reality Check in 2025:

I’ve worked across a bunch of companies and talked to way too many people about this stuff, and honestly the pattern is pretty obvious: if you don’t have a solid central platform and governance team, things get messy fast.

Company 1 (Consulting): Everything was random R / Python scripts tied together with Airflow, plus Databricks notebooks floating around.

Result: No standards, no ownership, just chaos.

Company 2 (E-commerce, “data-mesh-ish” but no platform team): BigQuery + Airflow with almost zero guardrails.

Result: Still chaos. No lineage, no visibility.

Credit-card example: central team gave a DS team access to sensitive data only for fraud modeling. They built a derived table, and because there were no permission controls, that table suddenly became visible to everyone in the company. No one caught it for weeks.

Company 3 (Large mobile-gaming company): Strong central platform team + distributed product analysts.

Result: Honestly the smoothest setup I’ve seen. Even less technical analysts shipped fast because the platform team handled the heavy lifting.

Company 4 (Small gaming studio): 1–2 engineers built the whole thing on Prefect + dbt and enforced strict rules manually.

Result: Super slow, pipelines broke constantly, everything was fragile.

Company 5 (Neo-bank): Huge data team, started doing full Data Mesh during COVID. Each domain ran its own infra and pipelines.

Result: Now they’re trying to re-centralize everything, and it’s incredibly painful. Every domain has different tools, different workflows, different security assumptions. They literally said they wish they had standardized the platform from day one.

So yeah, from everything I’ve seen:

Having a strong central platform/governance team that sets the standards and provides the tooling, and then letting domains build data products on top of that, is the only setup that doesn’t blow up over time.
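The pattern described above (a central platform that sets the standards, with domains building data products on top) can be sketched in a few lines of Python. This is only an illustrative sketch, not how any of the companies mentioned actually built it: `DataProduct`, `PlatformRegistry`, and the team and table names are all hypothetical. It shows one guardrail that would have caught the credit-card example, namely that a derived table can never be readable by more teams than its sources.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """A domain-owned table registered with the central platform (hypothetical model)."""
    name: str
    owner_domain: str
    allowed_readers: frozenset[str]   # teams permitted to read this product
    upstream: tuple[str, ...] = ()    # names of the products it is derived from


class PlatformRegistry:
    """Hypothetical central registry: domains publish, the platform enforces standards."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> DataProduct:
        # Standard 1: every product needs an owning domain.
        if not product.owner_domain:
            raise ValueError(f"{product.name}: an owner domain is required")
        # Standard 2: lineage must point at products the platform already knows.
        missing = [u for u in product.upstream if u not in self._products]
        if missing:
            raise ValueError(f"{product.name}: unknown upstream products {missing}")
        # Standard 3: a derived product inherits the restrictions of its sources,
        # so its readers are narrowed to the intersection with every upstream.
        readers = product.allowed_readers
        for upstream_name in product.upstream:
            readers = readers & self._products[upstream_name].allowed_readers
        effective = DataProduct(
            product.name, product.owner_domain, readers, product.upstream
        )
        self._products[product.name] = effective
        return effective

    def can_read(self, team: str, product_name: str) -> bool:
        return team in self._products[product_name].allowed_readers


# Usage: the fraud team derives a table from sensitive card data and
# (accidentally) tries to open it up to other teams.
registry = PlatformRegistry()
registry.register(
    DataProduct("card_transactions", "payments", frozenset({"fraud-ds"}))
)
derived = registry.register(
    DataProduct(
        "fraud_features",
        "fraud-ds",
        frozenset({"fraud-ds", "marketing", "analytics"}),
        upstream=("card_transactions",),
    )
)
print(sorted(derived.allowed_readers))  # ['fraud-ds']
```

The domain still owns and ships its product; the platform only narrows what it is allowed to expose. In a real setup this logic would live in the warehouse's permission system or a catalog rather than in application code.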

# Is Data Mesh Just Microservices?

Is Data Mesh an adaptation of the Microservices architectural principles applied to data management?

Microservices vs Data Mesh

# Additional Resources

# Data Fabric

Gartner calls it Data Fabric and has its own reasons for not calling it Data Mesh.

More on the differences in Gartner's Data fabric and data mesh: same or different:

  • The total cost to deliver either one may ultimately be similar relative to design and deployment. However, the more augmented data management capabilities included in a Data Fabric improve the cost model for ongoing improvement and maintenance.
  • Data Mesh and Data Fabric benefit from one another, either adapting to or leveraging best practices.
  • Both Data Fabric and Data Mesh materialized from mature data management practices and are based on over 50 years of data management technology advances.

# Meta Grid

Ole Olesen-Bagneux presented the Meta Grid as the 3rd wave of decentralization - the first being microservices, and the 2nd being data mesh.

# Examples


References: RW How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh Reverse ETL
Last Modified: 2021-10-28