Data Lake Table Formats (Open Table Formats)
Prominent table formats include Delta Lake, Apache Iceberg, and Apache Hudi.
Data lake table formats provide database-like features on top of distributed File Formats. Similar to a traditional table, these formats consolidate distributed files into a single table, simplifying management. Consider them an abstraction layer that structures your physical data files into coherent tables.
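To make that concrete, here is a minimal sketch, assuming pyspark and delta-spark are installed and using Delta Lake with a local /tmp path purely as the example: a table is just data files plus a transaction log.

```python
# Minimal sketch: an open table format = data files + a metadata/transaction log.
# Assumes `pip install pyspark delta-spark`; paths and names are illustrative.
from delta import configure_spark_with_delta_pip
from pyspark.sql import SparkSession

builder = (
    SparkSession.builder.appName("otf-demo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

# Write a few rows as a Delta table: physically just Parquet files.
spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"]) \
    .write.format("delta").save("/tmp/demo_table")

# Next to the Parquet files sits _delta_log/, the transaction log that
# turns the loose files into one coherent, versioned table.
import os
print(os.listdir("/tmp/demo_table"))
```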
# My Latest Complete Write-Ups
If you'd like to read my latest blog posts about this topic, not just the notes, here are my latest write-ups:
# Tools
Table format tools:
# Specialized Formats
- Apache Paimon: Real-time focused
- Lance: ML use-cases.
- DuckLake (a combination of OTF and Open Catalogs)
- Havasu (Spatial, on top of Iceberg)
# AI focused
# Why Open Table Formats
Because of the Open Data Architecture.
# Market Updates
- Databricks Data & AI Summit 2024: Databricks announced the acquisition of Tabular, the company behind Apache Iceberg, as well as the open sourcing of Unity Catalog.
- Snowflake announced Polaris Catalog, an open catalog that implements additional REST APIs for Iceberg on Snowflake and other engines.
- XTable (formerly OneTable): an omnidirectional converter between table formats, similar to Databricks’ Delta Universal Format (UniForm).
- Snowflake introduced Iceberg tables at their 2022 summit. (Delve into how Iceberg synergizes with Snowflake, particularly its open-source aspects.)
- At the Databricks Data & AI Summit 2022, Databricks declared Delta Lake fully open-source, including all features in Delta Lake 2.0 (e.g., z-ordering, optimization). This solidifies its position as a leading format, further enhanced by the open-source sharing feature Delta Sharing and Delta Live Table. They’re also developing an open-source marketplace (Databricks).
# General Features
- DML and SQL Support: Inserts, Upserts, Deletes.
- Provides merging, updating, and deleting directly on distributed files.
- Some formats also support Scala/Java and Python APIs in addition to SQL.
- Backward compatible with Schema Evolution and Enforcement
- Automatic Schema Evolution is a key feature in Table Formats, as changing schemas is still a challenging task in today’s data engineering work. Evolution here means we can add new columns or even widen some types without breaking anything. You can even rename or reorder columns, although that might break backward compatibility; we change one table and the Table Format takes care of updating it across all distributed files. Best of all, it does not require a rewrite of your table and underlying files (see the MERGE sketch after this list).
- ACID Transactions, Rollback, Concurrency Control (e.g., Delta has Optimistic Concurrency)
- A transaction is designed to either commit all changes or rollback, ensuring you never end up in an inconsistent state.
- Integrated with various cluster-computing frameworks (e.g., Iceberg with Apache Spark, Trino, Flink, Presto, Apache Hive, Impala; Hudi with Apache Spark, Presto, Trino, Hive; and Delta with Spark, Presto, Trino, Athena).
- Time Travel, Audit History with Transaction Log (Delta Lake) and Rollback
- Time travel allows the Table Format to version the big data stored in your data lake, enabling access to any historical version of that data. This simplifies data management, makes auditing easy, allows rolling back data in case of accidental bad writes or deletes, and helps reproduce experiments and reports (see the time travel/CDF sketch after this list).
- All formats assist with GDPR compliance.
- The Transaction Log (Open Table Formats) is an ordered record of every transaction performed on a table since its inception. For example, Delta Lake creates a single folder called `_delta_log` (details in Delta Lake). This log is a common component across many of its features, including ACID transactions, scalable metadata handling, and time travel.
- Scalable Metadata Handling: These table formats are not only equipped to handle a large number of big files, they also manage metadata at scale with automatic checkpointing and summarization.
- ℹ️ Helpful for GDPR compliance.
- Time-travel: Enables reproducible queries using the exact same table snapshot, or lets users easily examine changes. Version rollback allows for quick correction of problems by resetting tables to a good state.
- Avoids the need to implement complex Slowly Changing Dimension (Type 2), and with all transactions being recorded in the transaction log and having each version handy, you can extract changes as you would with Change Data Capture (CDC) (if done often enough to not lose intermediate changes).
- Partitioning / Partitioning Evolution
- These formats handle the tedious and error-prone task of producing partition values for rows in a table and automatically skip unnecessary partitions and files. No extra filters are needed for fast queries, and the table layout can be updated as data or queries change (see the hidden-partitioning sketch after this list).
- File Sizing, Data Clustering with Compaction
- Data can be compacted with OPTIMIZE in Delta Lake, and deletion of old versions can be managed by setting a retention period with VACUUM (see the compaction sketch after this list).
- Data compaction is supported out-of-the-box, offering different rewrite strategies such as bin-packing or sorting to optimize file layout and size.
- Unified Batch and Streaming Source and Sink (eliminating the need for Kappa Architecture)
- Supports streaming ingestion, Built-in CDC sources & tools (Hudi).
- It’s advantageous that it doesn’t matter whether you’re reading from a stream or a batch: Delta supports both with a single API and target sink (see the streaming sketch after this list). This is well explained in Beyond Lambda: Introducing Delta Architecture or through code examples. The often-used SQL MERGE statement can be applied to your distributed files as well with Delta, including schema evolution and ACID transactions.
- Data Sharing
- For example, Delta Sharing: An open protocol for secure data sharing, making it simple to share data with other organizations regardless of the computing platforms they use (see the Delta Sharing sketch after this list).
- Change Data Feed (CDF)
- The CDF feature enables tables to track row-level changes between versions. When enabled, it records “change events” for all data written into the table, including row data and metadata indicating whether the row was inserted, deleted, or updated. Currently, this is supported mainly by Delta (see the time travel/CDF sketch after this list).
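The sketches below illustrate the features above with minimal, hedged examples. They assume the Spark session and the /tmp/demo_table Delta table from the earlier sketch, and all table/column names are made up. First, an upsert with MERGE plus automatic schema evolution:

```python
# Hedged sketch: MERGE (upsert) directly on distributed files, with automatic
# schema evolution; the `extra` column does not exist in the target yet.
from delta.tables import DeltaTable

spark.conf.set("spark.databricks.delta.schema.autoMerge.enabled", "true")

updates = spark.createDataFrame(
    [(2, "b2", "x"), (3, "c", "y")], ["id", "val", "extra"])

target = DeltaTable.forPath(spark, "/tmp/demo_table")
(target.alias("t")
    .merge(updates.alias("u"), "t.id = u.id")
    .whenMatchedUpdateAll()      # update matching rows
    .whenNotMatchedInsertAll()   # insert new rows; `extra` is added to the schema
    .execute())
```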
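Time travel and the Change Data Feed on the same assumed table; note that CDF only records changes made after the table property is enabled:

```python
# Hedged sketch: time travel and Change Data Feed (CDF) reads.
spark.sql("ALTER TABLE delta.`/tmp/demo_table` "
          "SET TBLPROPERTIES (delta.enableChangeDataFeed = true)")

# Time travel: read the table exactly as it was at version 0.
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/demo_table")

# CDF: row-level changes between versions, with _change_type metadata columns.
changes = (spark.read.format("delta")
           .option("readChangeFeed", "true")
           .option("startingVersion", 2)
           .load("/tmp/demo_table"))
```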
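Hidden partitioning is best shown with Iceberg’s Spark DDL. This sketch assumes an Iceberg catalog named `demo` is configured in the session and the Iceberg Spark runtime is on the classpath:

```python
# Hedged sketch: Iceberg hidden partitioning with a partition transform.
spark.sql("""
    CREATE TABLE demo.db.events (id BIGINT, ts TIMESTAMP)
    USING iceberg
    PARTITIONED BY (days(ts))  -- partition values derived automatically from ts
""")

# Readers filter on ts only; Iceberg prunes partitions without an
# explicit partition column in the query.
spark.sql("SELECT * FROM demo.db.events WHERE ts >= TIMESTAMP '2024-01-01'")
```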
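Compaction and cleanup in Delta, as a sketch (168 hours matches Delta’s default 7-day retention window):

```python
# Hedged sketch: compact small files (with Z-ordering) and remove stale files.
spark.sql("OPTIMIZE delta.`/tmp/demo_table` ZORDER BY (id)")
spark.sql("VACUUM delta.`/tmp/demo_table` RETAIN 168 HOURS")
```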
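Unified batch and streaming: the same table works as a batch source and as a streaming source/sink through the same API (the sink path and checkpoint location are made up):

```python
# Hedged sketch: one Delta table as batch source, streaming source, and sink.
batch_df = spark.read.format("delta").load("/tmp/demo_table")

stream_df = spark.readStream.format("delta").load("/tmp/demo_table")
query = (stream_df.writeStream.format("delta")
         .option("checkpointLocation", "/tmp/demo_sink/_checkpoints")
         .start("/tmp/demo_sink"))
```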
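And Delta Sharing from the consumer side, using the delta-sharing Python connector; the profile file and share coordinates are placeholders that a data provider would hand you:

```python
# Hedged sketch: read a table shared via the open Delta Sharing protocol.
# `config.share` and the share/schema/table names are placeholders.
import delta_sharing

table_url = "config.share#my_share.my_schema.my_table"
df = delta_sharing.load_as_pandas(table_url)  # shared table as a pandas DataFrame
print(df.head())
```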
# Comparisons: Hudi, Iceberg, Delta
A detailed comparison of these formats is available in Comparison of Data Lake Table Formats (Iceberg, Hudi and Delta Lake).
Typically, Parquet’s binary columnar file format is the prime choice for storing data for analytics. However, there are situations where you may want your table format to use other file formats like Avro or ORC. The comparison table below shows which file formats can make up the data files of each table format.
# Apache Hudi vs Apache Iceberg vs Delta Lake
Apache Spark Summit 2020: “A Thorough Comparison of Delta Lake, Iceberg and Hudi”.
# Iceberg compared to the rest
A full feature comparison of open table formats:
| Feature | Apache Iceberg | Other Table Formats | Benefits |
|---|---|---|---|
| ACID Transactions | ✅ Full ACID transactions with optimistic concurrency | ✅ Delta Lake: Full ACID with optimistic concurrency ✅ Hudi: Full ACID with optimistic or pessimistic concurrency ⚠️ Lance: Versioning support but not full ACID | Ensures data consistency across operations and prevents data corruption during concurrent writes |
| Schema Evolution | ✅ Full schema evolution (add, drop, rename, reorder, update types) | ✅ Delta Lake: Similar capabilities ✅ Hudi: Similar capabilities ⚠️ Lance: Basic support | Allows tables to evolve without rewriting data or breaking compatibility |
| Time Travel | ✅ Supports time travel and version rollback | ✅ Delta Lake: Supports ✅ Hudi: Supports ✅ Lance: Zero-copy automatic versioning | Reproduces queries at specific points in time, enables rollback to previous states |
| Partitioning | ✅ Hidden partitioning with transformations (day, hour, bucket) and partition evolution | ⚠️ Delta Lake: Standard partitioning ⚠️ Hudi: Similar with clustering ⚠️ Lance: Limited partitioning | Automatic partition pruning without explicit filters, can evolve partition strategy without rewriting data |
| File Format Support | ✅ Supports Parquet, ORC, and Avro | ⚠️ Delta Lake: Primarily Parquet ✅ Hudi: Parquet, Avro, ORC ⚠️ Lance: Custom columnar format | Flexibility in storage choices to match performance needs |
| Copy-on-Write vs Merge-on-Read | ✅ Supports both | ✅ Delta Lake: Supports both ✅ Hudi: Supports both ⚠️ Lance: Not explicitly defined | Balance between write performance and read performance |
| Data Skipping | ✅ Column statistics in manifests for data skipping | ✅ Delta Lake: Column stats in checkpoint ✅ Hudi: Column stats with HFile formats ✅ Lance: Supports data skipping | Improves query performance by skipping irrelevant data files |
| Compaction | ✅ Data compaction with bin-packing, sorting, and Z-order | ✅ Delta Lake: OPTIMIZE with Z-order ✅ Hudi: Managed compaction services ⚠️ Lance: Not explicitly defined | Optimizes file layout and size for better performance |
| Incremental Processing | ⚠️ Change queries for incremental data processing | ⚠️ Delta Lake: Change Data Feed in 2.0+ ✅ Hudi: Incremental queries from beginning ❌ Lance: Not explicitly defined | Enables efficient processing of only data changes |
| Ecosystem Integration | ✅ Spark, Flink, Trino, Presto, Hive, DuckDB | ⚠️ Delta Lake: Strong Databricks integration ✅ Hudi: Spark, Presto, Hive, Flink ⚠️ Lance: Arrow, Pandas, Polars, DuckDB | Determines compatibility with existing data tools |
| Governance & Community | ✅ Apache Software Foundation; Netflix, Tabular (acq. by Databricks) | ✅ Delta Lake: Linux Foundation, Databricks ✅ Hudi: ASF, Uber ⚠️ Lance: Open source, LanceDB | Indicates project stability and development approach |
| Metadata Management | ✅ Avro manifest files with metadata table | ⚠️ Delta Lake: Parquet checkpoints ✅ Hudi: MoR-based metadata table ⚠️ Lance: Different architecture | Impacts metadata performance and scaling |
| Concurrency Model | ✅ Table-level validation for conflict detection | ✅ Delta Lake: Optimistic concurrency ✅ Hudi: Optimistic or pessimistic ❌ Lance: Not explicitly defined | Determines how concurrent writers are handled |
| Real-time Updates | ⚠️ Not optimized for real-time | ⚠️ Delta Lake: Not optimized ✅ Hudi: Streaming support ⚠️ Lance: Not optimized ✅ Paimon: Optimized for real-time updates | Important for streaming use cases and low-latency requirements |
| Cloud Integration | ✅ Works with all major clouds | ✅ Delta Lake: Works with all clouds, optimized for Databricks ✅ Hudi: Works with all clouds ✅ Lance: Cloud-agnostic | Flexibility in deployment options |
| Deletion Support | ✅ Position and equality deletes | ✅ Delta Lake: Row-level deletes ✅ Hudi: Row-level deletes ⚠️ Lance: Basic delete support | Affects GDPR compliance and data cleanup operations |
# Format Conversion
Exploring tools like Delta Universal Format (UniForm) and XTable can facilitate format transitions.
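As a sketch, UniForm is switched on through Delta table properties so that Iceberg metadata is written alongside the Delta log. The property names follow the Delta Lake 3.x docs, so treat them as an assumption to verify for your version (this reuses the `spark` session from the earlier sketches; the table name is made up):

```python
# Hedged sketch: create a Delta table with UniForm enabled so Iceberg
# clients can read it; property names per Delta Lake 3.x documentation.
spark.sql("""
    CREATE TABLE demo_uniform (id INT, val STRING)
    USING DELTA
    TBLPROPERTIES (
        'delta.enableIcebergCompatV2' = 'true',
        'delta.universalFormat.enabledFormats' = 'iceberg'
    )
""")
```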
# History and Evolution
```mermaid
gantt
    title Managed Iceberg Ecosystem
    dateFormat YYYY
    axisFormat %Y

    section File Formats
    Apache Hadoop            :milestone, m1, 2006, 0d
    Apache Hive (Metastore)  :milestone, m2, 2008, 0d
    RCFile                   :milestone, m3, 2010, 0d
    Apache ORC               :milestone, m4, 2013, 0d
    Apache Parquet           :milestone, m5, 2013, 0d

    section Table Formats
    Apache Hudi              :milestone, m6, 2016, 0d
    Delta Lake               :milestone, m7, 2017, 0d
    Apache Iceberg           :milestone, m8, 2017, 0d
    Apache Paimon            :milestone, m9, 2022, 0d
    Databricks acquires Tabular :milestone, m10, 2024, 0d

    section Conversions
    Delta UniForm            :milestone, m11, 2023, 0d
    Apache XTable            :milestone, m12, 2023, 0d

    section Catalogs
    Glue Catalog             :milestone, m13, 2021, 0d
    Unity Catalog            :milestone, m14, 2022, 0d
    Apache Polaris Catalog   :milestone, m15, 2024, 0d
    Snowflake Horizon Catalog :milestone, m16, 2024, 0d

    section Managed Services
    AWS S3 Tables            :milestone, m17, 2024, 0d
    Cloudflare R2 Data Catalog :milestone, m18, 2025, 0d
    Managed Iceberg by Databricks :milestone, m19, 2025, 0d
```
See more on Rill | The Open Table Format Revolution: Why Hyperscalers Are Betting on Managed Iceberg
# Hive
Probably the first system that made distributed storage accessible with SQL was Apache Hive on Hadoop; see Apache Hive.
# Table Format Catalogs
Further insights are available in Use transactional processing and in my Data Lake/Lakehouse Guide, where I wrote about this in more detail.
# Composable Data Stacks
Composable Data Stacks relate to table formats, as stacks such as the Lakehouse are built around open table formats.
# Downsides and Limits of Open Table Formats
Some of the downsides:
- More fragmented ecosystem: Multiple competing formats (Iceberg, Delta, Hudi) -> this should consolidate around Iceberg/Delta
- Low-level: These formats still require technical expertise and choosing/configuring a compute engine, compared to managed data warehouse solutions
- Compaction needed: Regular maintenance operations like compaction, optimization, and file cleanup need to be done manually
- Lower write concurrency: Not meant for many concurrent writers, compared to traditional databases
- Added complexity: Requires implementing manual data governance, catalog management, and operational processes that are built-in with traditional data warehouses
- Latency and performance trade-offs: Separation of compute and storage introduces network latency and slows query performance. While cost-effective, they will never be as fast as tightly integrated, proprietary Modern OLAP Systems
- Catalog layer lock-in: Despite open formats, the catalog/metastore layer still presents vendor lock-in challenges, as compatibility between different catalog solutions remains limited
See also Iceberg Specific Limitation 2025.
# Further Reads
- Onehouse’s Feature Comparison
  - Update log
    - 8/11/22 - Original publish date
    - 1/11/23 - Refresh feature comparisons, added community stats + benchmarks
    - 1/12/23 - Databricks contributed a few minor corrections
    - 10/31/23 - Minor edits
    - 1/31/24 - Minor update about current state of OneTable
- Dremio’s Comparison
- LakeFS’s Overview
- The ultimate guide to table format internals - all my writing so far — Jack Vanlightly
Origin:
Data Lake / Lakehouse Guide: Powered by Data Lake Table Formats (Delta Lake, Iceberg, Hudi)
Created 2022-06-10