Data Mesh: Decentralizing and Unifying Data Resources

Last updated by Simon Späti

In the evolving landscape of data management, Data Meshes have emerged as a crucial concept for bridging the gaps between isolated data teams. Their core value lies in fostering a shared understanding and usage of data across diverse teams within an organization.

By effectively interlinking platforms, Data Meshes facilitate seamless data transfer, enhancing the organizational workflow. This approach combats the typical disconnect in data handling, offering a solution that balances decentralized resources with a unifying common infrastructure. It empowers domain experts, granting them ownership and control over their data domains.
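To make the balance between decentralized ownership and shared infrastructure more concrete, here is a minimal sketch in Python. It is an illustration only, not a reference implementation of any Data Mesh product: the names `DataProduct`, `MeshCatalog`, `publish`, and `discover` are hypothetical and chosen just for this example. Each domain team publishes and owns its own data products, while a single catalog plays the role of the common infrastructure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class DataProduct:
    """A dataset published and owned by one domain team (hypothetical model)."""
    domain: str
    name: str
    owner: str
    schema: dict[str, str]


@dataclass
class MeshCatalog:
    """Stand-in for the shared infrastructure: one registry for all domains."""
    products: dict[str, DataProduct] = field(default_factory=dict)

    def publish(self, product: DataProduct) -> str:
        """Register a product; only the owning team may overwrite it."""
        key = f"{product.domain}.{product.name}"
        existing = self.products.get(key)
        if existing is not None and existing.owner != product.owner:
            raise PermissionError(f"{key} is owned by {existing.owner}")
        self.products[key] = product
        return key

    def discover(self, domain: Optional[str] = None) -> list[str]:
        """List product keys, optionally restricted to one domain."""
        return sorted(
            key for key, product in self.products.items()
            if domain is None or product.domain == domain
        )


# Two domain teams publish independently to the same catalog.
catalog = MeshCatalog()
orders_key = catalog.publish(
    DataProduct("sales", "orders", "sales-team", {"order_id": "string"})
)
catalog.publish(
    DataProduct("marketing", "campaigns", "marketing-team", {"campaign_id": "string"})
)
all_products = catalog.discover()
sales_products = catalog.discover("sales")
```

The ownership check in `publish` is the decentralized part (a team controls only its own domain's products), and the single `discover` entry point is the unifying part. A real mesh would add access control, versioning, and data contracts, none of which are modeled here.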

For deeper insights, consider exploring the foundational paper on this topic, the succinct explanation in What the Heck is a Data Mesh?! or the visually engaging Data Mesh Architecture. Practical applications and perspectives are well-articulated in Data Mesh in Practice.


A nice illustration of the problem of a central data platform team | Shared on LinkedIn by Ole Olesen-Bagneux

# History

The term data mesh was first defined by Zhamak Dehghani in 2019 while she was working as a principal consultant at the technology company Thoughtworks. She then provided greater detail on its principles and logical architecture throughout 2020. The approach was predicted to be a “big contender” for companies in 2022. Data meshes have been implemented by companies such as Zalando, Netflix, Intuit, VistaPrint, PayPal and others.

In 2022, Dehghani left Thoughtworks to found Nextdata Technologies to focus on decentralized data. Source

# Fleeting Thoughts

From the Netflix Technology blog:

“I agree that naming is confusing, but it’s just an unfortunate timing: the development of “Data Mesh” (DM) platform at Netflix started around the same time Zhamak Dehghani first defined the term in the “Beyond the lake” talk in 2018. As we state, “we define Data Mesh as a general purpose data movement and processing platform for moving data between Netflix systems at scale”, nothing more, nothing less.”

Source: RW Data Mesh - A Data Movement and Processing Platform @ Netflix, by Netflix Technology Blog, Netflix TechBlog

From Reddit:

  • “Data Mesh: a Data Warehouse that has surpassed Dunbar’s number.”
  • “Data Lake: a collection of data that has reached its lowest point.”

More on Why data pipeline should not be outside of data product.

# Demystifying Data Mesh

  • Is Data Mesh essential for everyone? Probably not. For a critical perspective, see Behind the Hype: Why Data Mesh Is Not Right For You - YouTube.

    • Tweet
      • Another one by Max: “I’m with you. It feels like a bit of everything trendy served in a cocktail of buzzwords. I read Kimball and Inmon 2 decades ago, been a practitioner at many very modern data forward companies (FB, Airbnb, Lyft), and I feel very disoriented reading about data mesh. I don’t think it’s because I don’t get it, but because putting a label on the distributed chaos going on in these organizations doesn’t make it an intelligible concept.” Tweet
  • Alternatives to Data Mesh include Software-Defined Asset by Dagster.
  • For a comprehensive overview, refer to Data Mesh Architecture and various other perspectives listed below.
  • The concept has its skeptics. For instance, a comment on Twitter expresses confusion and doubt about labeling the chaotic, distributed data environments in modern organizations.
  • The most practical insights into Data Mesh might be found in Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes Beyond the Data Lake - YouTube, though it may not fully capture the claimed paradigm shift.
    • “It falls very short of the paradigm shift data mesh claims to be. Infra described there seemed median-ish on the data maturity curve at best compared to 100+ companies I connected with in the context of my work on Airflow & Superset.”
  • Matthew Darwin’s insights on the Firebolt approach to architectures in Firebolt and Data Mesh | Firebolt are also noteworthy.

# Data Mesh is just a Microservice?

Is Data Mesh an adaptation of the Microservices architectural principles applied to data management?

Microservices vs Data Mesh

# Additional Resources

# Data Fabric

Gartner prefers the term Data Fabric and has its own reasons for not calling it Data Mesh.

More on the differences in Gartner’s Data fabric and data mesh: same or different:

  • The total cost to deliver either one may ultimately be similar relative to design and deployment. However, the more augmented data management capabilities included in a Data Fabric improve the cost model for ongoing improvement and maintenance.
  • Data Mesh and Data Fabric benefit from one another, either adapting to or leveraging best practices.
  • Both Data Fabric and Data Mesh materialized from mature data management practices and are based on over 50 years of data management technology advances.

# Meta Grid

Ole Olesen-Bagneux presented the Meta Grid as the 3rd wave of decentralization - the first being microservices, and the 2nd being data mesh.


References: RW How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh, Reverse ETL
Last Modified: 2021-10-28
Created: 2021-10-28