Data Mesh: Decentralizing and Unifying Data Resources
In the evolving landscape of data management, Data Meshes have emerged as an important concept that aims to bridge the gaps between isolated data teams. Their core value lies in fostering a shared understanding and usage of data across diverse teams within an organization.
By effectively interlinking platforms, Data Meshes facilitate seamless data transfer, enhancing the organizational workflow. This approach combats the typical disconnect in data handling, offering a solution that balances decentralized resources with a unifying common infrastructure. It empowers domain experts, granting them ownership and control over their data domains.
For deeper insights, consider exploring the foundational paper on this topic, the succinct explanation in What the Heck is a Data Mesh?! or the visually engaging Data Mesh Architecture. Practical applications and perspectives are well-articulated in Data Mesh in Practice.
A nice illustration of the problem of a central data platform team
# Fleeting Thoughts
- In Monday Morning Podcast #137 with Brian Olsen, the discussion turns to the potential decline of Data Mesh once the hype fades. Brian Olsen compares its necessity to that of Kubernetes in operations and suggests a service that uses YAML to define Data Contracts, including governance aspects such as legal data storage requirements.
- Dagster is mentioned as a promising tool that aligns with Data Mesh ideals through its implementation of Software-Defined Asset and Data Assets, automating data asset updates and their downstream representations.
- The conversation extends beyond Monolith Data, with Netflix clarifying their Data Mesh platform’s purpose as a scalable data movement and processing tool (source: RW Data Mesh - A Data Movement and Processing Platform @ Netflix by Netflix Technology Blog, Netflix TechBlog).
- The interest in data meshes and Data Contracts is born out of related frustrations, and a desire to enforce some degree of consistency between, among other things, disconnected products and services. RW The Conglomerate - By Benn Stancil - benn.substack by Benn Stancil.
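
To make the Data Contract idea from the podcast note more concrete, here is a minimal sketch in Python. It is not taken from the podcast or from any existing tool: the contract is written as the dict a YAML parser would return, and every field name (`dataset`, `owner`, `columns`, `governance`, `storage_region`, `retention_days`) is a hypothetical choice for illustration.

```python
# Hypothetical contract, shaped like the dict a YAML parser would return.
# All field names and values here are assumptions made for illustration.
ORDERS_CONTRACT = {
    "dataset": "orders",
    "owner": "checkout-team",
    "columns": {"order_id": "int", "customer_email": "str", "amount": "float"},
    "governance": {"storage_region": "eu", "retention_days": 365},
}

# Maps the type names used in the contract to Python types.
PYTHON_TYPES = {"int": int, "str": str, "float": float}


def validate_record(record: dict, contract: dict) -> list[str]:
    """Return the contract violations for one record (empty list if valid)."""
    errors = []
    # Every column promised by the contract must be present with the right type.
    for column, type_name in contract["columns"].items():
        if column not in record:
            errors.append(f"missing column: {column}")
        elif not isinstance(record[column], PYTHON_TYPES[type_name]):
            errors.append(f"{column}: expected {type_name}")
    # Columns the contract does not know about are reported as well.
    for column in record:
        if column not in contract["columns"]:
            errors.append(f"unexpected column: {column}")
    return errors


def check_storage(region: str, contract: dict) -> bool:
    """Governance check: is the data stored in the region the contract requires?"""
    return region == contract["governance"]["storage_region"]
```

A producing team could run `validate_record` before publishing and `check_storage` at deployment time, so that both the schema and the legal storage requirement live in one reviewable file.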
From the Netflix Technology blog:
I agree that naming is confusing, but it’s just an unfortunate timing: the development of “Data Mesh” (DM) platform at Netflix started around the same time Zhamak Dehghani first defined the term in the “Beyond the lake” talk in 2018. As we state, “we define Data Mesh as a general purpose data movement and processing platform for moving data between Netflix systems at scale”, nothing more, nothing less. RW Data Mesh - A Data Movement and Processing Platform @ Netflix by Netflix Technology Blog, Netflix TechBlog
From Reddit - Dive into anything:
- “Data Mesh: a Data Warehouse that has surpassed Dunbar’s number.”
- “Data Lake: a collection of data that has reached its lowest point.”
More on Why data pipeline should not be outside of data product.
# Demystifying Data Mesh
- Is Data Mesh essential for everyone? Probably not. For a critical perspective, see Behind the Hype: Why Data Mesh Is Not Right For You - YouTube.
- Another one by Max: “I’m with you. It feels like a bit of everything trendy served in a cocktail of buzzwords. I read Kimball and Inmon 2 decades ago, been a practitioner at many very modern data forward companies (FB, Airbnb, Lyft), and I feel very disoriented reading about data mesh.”
- I don’t think it’s because I don’t get it, but because putting a label on the distributed chaos going on in these organizations doesn’t make it an intelligible concept. (Tweet)
- Alternatives to Data Mesh include Software-Defined Asset by Dagster.
- For a comprehensive overview, refer to Data Mesh Architecture and various other perspectives listed below.
- The concept has its skeptics. For instance, a comment on Twitter expresses confusion and doubt about labeling the chaotic, distributed data environments in modern organizations.
- The most practical insights into Data Mesh might be found in Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes Beyond the Data Lake - YouTube, though it may not fully capture the claimed paradigm shift.
- It falls very short of the paradigm shift data mesh claims to be. Infra described there seemed median-ish on the data maturity curve at best compared to 100+ companies I connected with in the context of my work on Airflow & Superset.
- Matthew Darwin’s insights on the Firebolt approach to architectures in Firebolt and Data Mesh | Firebolt are also noteworthy.
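
The Software-Defined Asset idea mentioned in the list above (declare each data asset together with its upstream dependencies, then let tooling update everything downstream) can be sketched in plain Python. This is not Dagster's API; the `asset` decorator, the `ASSETS` registry and `materialize` are hypothetical names invented for this example, and the order data is made up.

```python
from graphlib import TopologicalSorter

# Registry of asset name -> (upstream asset names, compute function).
ASSETS = {}


def asset(*deps):
    """Register a function as a data asset that depends on the assets in `deps`."""
    def register(fn):
        ASSETS[fn.__name__] = (deps, fn)
        return fn
    return register


@asset()
def raw_orders():
    # Stand-in for reading from a source system.
    return [{"id": 1, "amount": 20.0}, {"id": 2, "amount": 5.0}]


@asset("raw_orders")
def large_orders(raw_orders):
    return [o for o in raw_orders if o["amount"] >= 10]


@asset("large_orders")
def large_order_count(large_orders):
    return len(large_orders)


def materialize():
    """Compute every registered asset in dependency order, upstream first."""
    graph = {name: set(deps) for name, (deps, _) in ASSETS.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        deps, fn = ASSETS[name]
        # Each asset receives the already-computed values of its upstream assets.
        results[name] = fn(*(results[d] for d in deps))
    return results
```

Calling `materialize()` recomputes `raw_orders`, then `large_orders`, then `large_order_count`. The design point is that the dependency graph is derived from the asset declarations themselves rather than from a separately maintained pipeline definition.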
# Data Mesh is just a Microservice?
Is Data Mesh an adaptation of the Microservices architectural principles applied to data management?
# Additional Resources
- Data Mesh - Fad Or Fab? - Monte Carlo Data
- The Last Thing I’ll Ever Say About the Data Mesh - by Pedram Navid - data based
- RW What the Heck Is a Data Mesh?! by Chris Riccomini
- RW Data Mesh Overhyped, Misunderstood, and Useful!
- Data Mesh in Practice: How Europe’s Leading Online Platform for Fashion Goes Beyond the Data Lake
- Top Global Data Management Trends for 2022 | Astronomer - Astronomer
# Data Fabric
Gartner calls it Data Fabric and has its own reasons for not calling it Data Mesh.
More on the differences in Gartner’s Data fabric and data mesh: same or different:
- The total cost to deliver either one may ultimately be similar relative to design and deployment. However, the more augmented data management capabilities included in a Data Fabric improve the cost model for ongoing improvement and maintenance.
- Data Mesh and Data Fabric benefit from one another, either adapting to or leveraging best practices.
- Both Data Fabric and Data Mesh materialized from mature data management practices and are based on over 50 years of data management technology advances.
References: RW How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh, Reverse ETL
Last Modified: 2021-10-28
Created: 2021-10-28