Transaction Log in Open Table Formats

Last updated by Simon Späti

Transaction logs are the fundamental backbone of modern Open Table Formats, enabling ACID transactions, metadata management, time travel capabilities, and concurrent operations on Data Lakes. This note explores how transaction logs are implemented across three major open table formats: Delta Lake, Apache Hudi, and Apache Iceberg.

# Core Concepts of Transaction Logs

A transaction log is an ordered record of all operations performed on a table since its creation. It serves as:

  1. Single Source of Truth: The definitive record of a table’s state and history
  2. ACID Transaction Enabler: Ensures atomicity, consistency, isolation, and durability
  3. Metadata Management System: Tracks schema, partitioning, and file information
  4. Concurrency Controller: Manages multiple simultaneous reads and writes
  5. Time Travel Facilitator: Enables querying historical table states
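
To make these roles concrete, here is a minimal, format-agnostic Python sketch (not tied to any of the formats below): a table is modeled as an ordered list of commits, and reconstructing the current state, or any historical version, simply means replaying the log up to a given version.

```python
from dataclasses import dataclass, field

@dataclass
class Commit:
    """One ordered entry in a toy transaction log: data files added and removed."""
    version: int
    added: set[str] = field(default_factory=set)
    removed: set[str] = field(default_factory=set)

def table_state(log: list[Commit], as_of: int | None = None) -> set[str]:
    """Replay the log in order to reconstruct the set of live data files.

    Passing `as_of` replays only up to that version, which is conceptually
    all that time travel does.
    """
    files: set[str] = set()
    for commit in sorted(log, key=lambda c: c.version):
        if as_of is not None and commit.version > as_of:
            break
        files |= commit.added
        files -= commit.removed
    return files

log = [
    Commit(0, added={"file-1.parquet"}),
    Commit(1, added={"file-2.parquet"}),
    Commit(2, added={"file-3.parquet"}, removed={"file-1.parquet"}),
]
print(table_state(log))           # {'file-2.parquet', 'file-3.parquet'}
print(table_state(log, as_of=1))  # {'file-1.parquet', 'file-2.parquet'}
```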

# Examples

# Delta Lake Transaction Log

Delta Lake Transaction Log Structure:

```
my_table/
├── _delta_log/            # Transaction log directory
│   ├── 00000000000000000000.json  # First commit
│   ├── 00000000000000000001.json  # Second commit
│   ├── 00000000000000000002.json  # Third commit
│   ├── ...
│   ├── 00000000000000000010.checkpoint.parquet  # Checkpoint file (every 10 commits)
│   └── ...
├── date=2019-01-01/       # Optional partition directories
│   └── file-1.parquet     # Data files
└── ...
```
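
Because each commit file is newline-delimited JSON in which every line holds one action (`add`, `remove`, `metaData`, `commitInfo`, ...), the log can be inspected with just the Python standard library. A minimal sketch, assuming a local table laid out as above:

```python
import json
from pathlib import Path

def summarize_delta_log(table_path: str) -> None:
    """Summarize each commit in _delta_log (newline-delimited JSON actions)."""
    log_dir = Path(table_path) / "_delta_log"
    for commit_file in sorted(log_dir.glob("*.json")):
        version = int(commit_file.stem)  # e.g. "00000000000000000002" -> 2
        actions = [json.loads(line) for line in commit_file.read_text().splitlines() if line]
        adds = sum("add" in a for a in actions)
        removes = sum("remove" in a for a in actions)
        print(f"version {version}: {adds} add action(s), {removes} remove action(s)")

summarize_delta_log("my_table")  # table laid out as above
```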

See a detailed deep dive in: Transaction Log (Delta Lake).

# Apache Iceberg Transaction Log

Structure and Implementation

  • Layered Architecture: Comprises catalog layer, metadata layer, and data layer
  • Metadata Files: Store global table metadata (schema, partitioning, properties)
  • Snapshots: Represent table state at specific points in time
  • Manifest Files: Track data files, including their locations, sizes, and statistics
  • Atomic Swaps: Each table state update writes a new metadata file that replaces the previous one via an atomic swap

Key Functions

  • Catalog Operations: Atomic operations at the catalog level ensure transaction correctness
  • Optimistic Concurrency: Uses sequence numbers to maintain consistency with concurrent transactions
  • Metadata Logging: Records history of metadata changes for rollback capabilities
  • Schema Evolution: Supports schema changes without table rewrites
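
As a toy illustration of the atomic-swap-plus-retry idea (not the actual Iceberg client code), the sketch below models the catalog as a single pointer to the current metadata file: a commit only succeeds if that pointer has not moved since the writer read it, otherwise the writer rebases and retries.

```python
from typing import Callable

class ToyCatalog:
    """Holds a single pointer to the table's current metadata file.

    Real catalogs (Hive, REST, Nessie, ...) perform this compare-and-swap
    atomically; a plain comparison stands in for it here.
    """
    def __init__(self, current: str) -> None:
        self.current = current

    def swap(self, expected: str, new: str) -> bool:
        if self.current != expected:  # someone else committed first
            return False
        self.current = new
        return True

def commit(catalog: ToyCatalog, build_metadata: Callable[[str], str]) -> str:
    """Optimistic commit loop: read the base version, write new metadata, retry on conflict."""
    while True:
        base = catalog.current
        new = build_metadata(base)  # e.g. v3.metadata.json built on top of v2
        if catalog.swap(base, new):
            return new

catalog = ToyCatalog("v1.metadata.json")
print(commit(catalog, lambda base: "v2.metadata.json"))  # 'v2.metadata.json'
```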

Apache Iceberg Transaction Log Structure:

```
my_table/
├── metadata/              # Metadata directory
│   ├── version-hint.text  # Points to latest metadata file
│   ├── v1.metadata.json   # First version metadata file
│   ├── v2.metadata.json   # Second version metadata file
│   ├── snap-<uuid>.avro   # Manifest list for first snapshot
│   ├── snap-<uuid>.avro   # Manifest list for second snapshot
│   └── <uuid>.avro        # Manifest file with data file details
├── data/                  # Data files directory
│   └── <uuid>.parquet     # Actual data file
└── ...
```
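
With the file-system (Hadoop) catalog layout above, the current table state can be resolved by hand: read `version-hint.text`, open the referenced `vN.metadata.json`, and list its snapshots. A minimal sketch, with field names taken from the Iceberg table spec (other catalogs keep the pointer to the current metadata file in the catalog itself):

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def list_iceberg_snapshots(table_path: str) -> None:
    """Resolve the current metadata file via version-hint.text and print all snapshots."""
    meta_dir = Path(table_path) / "metadata"
    version = (meta_dir / "version-hint.text").read_text().strip()
    metadata = json.loads((meta_dir / f"v{version}.metadata.json").read_text())

    current_id = metadata.get("current-snapshot-id")
    for snap in metadata.get("snapshots", []):
        ts = datetime.fromtimestamp(snap["timestamp-ms"] / 1000, tz=timezone.utc)
        marker = "  <- current" if snap["snapshot-id"] == current_id else ""
        print(f"{snap['snapshot-id']}  {ts:%Y-%m-%d %H:%M:%S}  {snap['manifest-list']}{marker}")

list_iceberg_snapshots("my_table")
```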

More details:

# Apache Hudi Transaction Log

Structure and Implementation

  • Timeline-Based Architecture: Organizes transactions as actions along a timeline
  • File Organization: Uses a directory-based approach, with timestamped instant files and log files that track changes
  • Metadata Table: Tracks file information for query optimization (default since v0.11.0)
  • Commit Files: Tracks transaction states through files named [timestamp].[action] (completed) or [timestamp].[action].[state] for pending states

Key Functions:

  • Record-Level Index: Maintains mapping between record keys and file groups
  • Optimistic Concurrency: File-level, log-based concurrency control based on instant times
  • Asynchronous Operations: Supports background operations like compaction without blocking ingestion
  • Copy-on-Write vs. Merge-on-Read: Offers two table types with different performance characteristics

Apache Hudi Transaction Log Structure:

```
my_table/
├── .hoodie/               # Metadata directory
│   ├── hoodie.properties  # Table configuration
│   ├── 20230101120000.commit        # Commit metadata (successful)
│   ├── 20230101130000.commit.requested  # Transaction state: requested
│   ├── 20230101130000.commit.inflight   # Transaction state: in progress
│   ├── 20230101140000.deltacommit      # Delta commit for MOR tables
│   ├── 20230101150000.rollback         # Failed transaction rollback
│   ├── 20230101160000.clean            # Cleaning operation
│   ├── 20230101170000.compaction       # Compaction operation
│   ├── metadata/          # Metadata table (since v0.11.0)
│   ├── aux/               # Auxiliary files
│   └── .heartbeat/        # Heartbeat management
├── partition=value/       # Partition directories
│   ├── file1_v1.parquet   # Base file (COW table)
│   ├── file1_v2.parquet   # Updated base file after update
│   ├── file2.parquet      # Another base file
│   ├── file2.log.1        # Delta log file (MOR table)
│   └── file2.log.2        # Another delta log file
└── ...
```
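
The timeline can be reconstructed simply by parsing the instant file names in `.hoodie/`: completed instants are named `[timestamp].[action]`, while pending ones additionally carry a `.requested` or `.inflight` suffix. A small sketch against the layout above:

```python
from pathlib import Path

PENDING_STATES = {"requested", "inflight"}

def read_hudi_timeline(table_path: str) -> list[tuple[str, str, str]]:
    """Parse .hoodie instant files into (instant_time, action, state) tuples."""
    timeline = []
    for f in Path(table_path, ".hoodie").iterdir():
        if not f.is_file() or f.name == "hoodie.properties":
            continue
        parts = f.name.split(".")
        if len(parts) >= 3 and parts[-1] in PENDING_STATES:
            instant, action, state = parts[0], ".".join(parts[1:-1]), parts[-1]
        elif len(parts) == 2:
            instant, action, state = parts[0], parts[1], "completed"
        else:
            continue  # skip anything that doesn't follow the instant naming scheme
        timeline.append((instant, action, state))
    return sorted(timeline)

for instant, action, state in read_hudi_timeline("my_table"):
    print(instant, action, state)
```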

More details:

  • Apache Hudi Concepts - Official documentation explaining Hudi’s timeline, file organization, and table structure.

# DuckLake Transaction Table

More recently, we also have DuckLake, which stores table metadata in a SQL database instead of in metadata files on disk.
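
A minimal sketch of what that looks like with the DuckDB `ducklake` extension; the file names `metalake.ducklake` and `lake_data/` are placeholders, and the metadata database could just as well be e.g. PostgreSQL instead of a local DuckDB file:

```python
import duckdb

con = duckdb.connect()

# Assumed setup for the DuckDB ducklake extension; the metadata catalog lands in a
# local DuckDB file here ('metalake.ducklake' is a placeholder), data files are
# written under 'lake_data/'.
con.execute("INSTALL ducklake; LOAD ducklake;")
con.execute("ATTACH 'ducklake:metalake.ducklake' AS lake (DATA_PATH 'lake_data/')")
con.execute("USE lake")

# Each change below becomes a snapshot row in the SQL metadata catalog rather than
# a JSON/Avro metadata file sitting next to the data.
con.execute("CREATE TABLE events (id INTEGER, payload TEXT)")
con.execute("INSERT INTO events VALUES (1, 'first commit'), (2, 'second commit')")
print(con.execute("SELECT * FROM events").fetchall())
```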

# DuckLake Table Structure

The data model as of 2025-06-05, along with more details, can be found at DuckLake.

# Comparison of Transaction Log Approaches

| Feature | Delta Lake | Apache Hudi | Apache Iceberg |
| --- | --- | --- | --- |
| Concurrency Control | Optimistic concurrency control with mutual exclusion and a retry mechanism | File-level, log-based concurrency control ordered by start instant times | Sequence-number-based optimistic concurrency control |
| Metadata Management | JSON log files with Parquet checkpoints every 10 commits | Timeline-based approach with a metadata table for query optimization | Layered approach with a catalog pointing to metadata files |
| Update Handling | Breaks operations into atomic commits recorded sequentially | Offers Copy-on-Write and Merge-on-Read approaches for different performance needs | Supports eager data file rewrites or delete deltas for faster updates |
| Performance Characteristics | Efficient for append-heavy workloads with Spark integration | Excels at update-heavy workloads with upserts and record-level indexing | Strong query performance with optimized metadata handling |
| Time Travel | Supported via transaction log processing | Supported via timeline-based snapshots | Supported via versioned metadata and snapshots |
| Origins | Developed by Databricks | Developed by Uber | Developed by Netflix |
| Primary Integration | Apache Spark | Multiple engines, with a Spark, Flink, and Hive focus | Multi-engine, with strong Spark, Flink, and Trino support |
| Schema Evolution | Supported, with column additions/deletions | Supported, with schema tracking | Extensive support, with in-place evolution |

See also more on Open Table Formats, The Open Table Format Revolution, and Composable Open Data Platform: How Iceberg and Open Table Formats Are Reshaping Data Architecture.


Origin: Data Lake Table Format
References:
Created 2025-04-29