Data Engineering Acquisitions

Last updated by Simon Späti

Consolidation in the data engineering market is happening quickly: tools from the Modern Data Stack are being unified into bigger data platforms. This note gives an overview of the latest acquisitions across data engineering.

The chart below gives an overview of acquisitions from 2022 to today.

gantt
    title Data Engineering Acquisitions Timeline 2022-2025
    dateFormat YYYY-MM
    axisFormat %b %Y
    
    section Fivetran + dbt
    dbt → MetricFlow                     :milestone, 2023-02, 0d
    dbt → SDF                            :milestone, 2025-01, 0d
    Fivetran → Census                    :milestone, 2025-05, 0d
    Fivetran → SQLMesh                   :milestone, 2025-09, 0d
    Fivetran → dbt                       :milestone, 2025-10, 0d
    
    section Databricks 
    Okera                   :milestone, 2023-05, 0d
    MosaicML                :milestone, 2023-06, 0d
    Arcion                  :milestone, 2023-10, 0d
    Tabular                 :milestone, 2024-06, 0d
    Neon                    :milestone, 2025-05, 0d
    
    section Snowflake 
    Streamlit                :milestone, 2022-03, 0d
    Applica                  :milestone, 2022-08, 0d
    SnowConvert              :milestone, 2023-01, 0d
    LeapYear                 :milestone, 2023-02, 0d
    Neeva                    :milestone, 2023-05, 0d
    TruEra                   :milestone, 2024-05, 0d
    Datavolo                 :milestone, 2024-11, 0d
    Crunchy Data             :milestone, 2025-06, 0d
    Select Star              :milestone, 2025-11, 0d
    
    section Confluent 
    Immerok                  :milestone, 2023-01, 0d
    WarpStream               :milestone, 2024-09, 0d
    
    section Qlik 
    Talend                        :milestone, 2023-01, 0d
    Kyndi                         :milestone, 2024-01, 0d
    Upsolver                      :milestone, 2025-01, 0d
    
    section Data Quality & Observability
    IBM → Databand.ai                    :milestone, 2022-06, 0d
    Bigeye → Data Advantage Group        :milestone, 2023-06, 0d
    Datadog → Metaplane                  :milestone, 2025-04, 0d
    Soda → nannyML                       :milestone, 2025-06, 0d
    
    section Analytics & BI
    Alteryx → Trifacta                   :milestone, 2022-01, 0d
    Hex → Hashboard                      :milestone, 2025-04, 0d
    Coalesce.io → CastorDocs             :milestone, 2025-03, 0d
    
    section Streaming & Real-time
    Cloudflare → Arroyo                  :milestone, 2025-04, 0d
    Redis → Decodable                    :milestone, 2025-09, 0d
    
    section Database & Infrastructure
    NetApp → Instaclustr                 :milestone, 2022-04, 0d
    Vector Capital → SingleStore         :milestone, 2025-09, 0d

# Timeline

The timelines below show each year’s acquisitions, from 2025 back to 2022. After that, the note covers related topics such as Bundling vs. Unbundling and the general state of the data engineering ecosystem.

# 2025

# 2024

# 2023

# 2022

# Bundling vs. Unbundling

Bundling vs Unbundling

# Once Called, Software is Eating the World

The trend was once summed up as “Software is eating the world”; now the pendulum seems to be swinging back toward more unified and integrated data platforms.

I think the best way for OSS products to survive is to embrace the «Declarative Data Stack» approach, where integration happens with a single configuration file. By integrating with multiple tools, you gain the best of both worlds: a combination of integrated and open-source capabilities, along with end-to-end analytics.
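To make the "single configuration file" idea concrete, here is a minimal sketch in Python. It assumes a hypothetical stack definition in which each layer names a tool and the layers it depends on; the keys (`tool`, `depends_on`), the placeholder tool names, and the `plan` helper are illustrative assumptions, not the schema of any real product.

```python
from graphlib import TopologicalSorter

# Hypothetical single-file stack definition. Layer names, keys, and tool
# names are placeholders chosen for illustration only.
STACK = {
    "ingestion": {"tool": "example-loader", "depends_on": []},
    "storage": {"tool": "example-warehouse", "depends_on": ["ingestion"]},
    "transformation": {"tool": "example-sql-models", "depends_on": ["storage"]},
    "bi": {"tool": "example-dashboards", "depends_on": ["transformation"]},
}


def plan(stack):
    """Return layer names in an order that respects every depends_on entry."""
    # Reject references to layers that are not declared in the file.
    for name, spec in stack.items():
        for dep in spec["depends_on"]:
            if dep not in stack:
                raise ValueError(f"{name!r} depends on undefined layer {dep!r}")
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(spec["depends_on"]) for name, spec in stack.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for layer in plan(STACK):
        print(f"{layer}: {STACK[layer]['tool']}")
```

The point of the sketch is that the stack is described as data: swapping one open-source tool for another means changing one entry, while the ordering and validation logic stays the same.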

Making money from open source is hard, but I still hope that many will pursue this path. When deciding on a tool, I will always pick the open-source one. To me, it builds trust, and because it is shared as a gift, free for anyone to use, it makes me want to support it more.

Consider the Framework Laptops. They’re fully repairable, with every part replaceable, allowing you to swap out the screen or even the motherboard later on. This is something I want to support. The same goes for OSS. I believe the strategy shouldn’t be to cash out on OSS, but rather to treat it as a sign of valuing your customers by giving them a gift.

The revenue stream should be independent of the OSS project, if at all possible, so that there is a clear distinction and the project is not confused with the company down the road. This is much easier said than done, but there are still many great data engineering companies doing exactly that. I hope it stays this way.

What do you think: should a vendor still maintain an open-source product, or focus on making money so it can sustain itself over time?

# Fivetran + dbt

Fivetran + dbt Merger

# Table Formats Market Updates

See Open Table Formats (Market Updates).

# AI Acquisition

Separate noteworthy acquisitions in AI related to data engineering.

# OSS vs. Closed-Source

Open-Source vs Closed-Source Data Engineering

# Further Reads


Origin: Data Engineering
References: Earning Money with the Open-Source Model—Making Gifts
Created 2025-05-02