Apache Iceberg

Last updated Apr 11, 2025

Iceberg is a high-performance Data Lake Table Format for huge analytic tables. Iceberg brings the reliability and simplicity of SQL tables to big data while making it possible for engines to safely work with the same tables, at the same time.

The project was originally developed at Netflix to solve long-standing issues with their usage of huge, petabyte-scale tables. It was open-sourced in 2018 as an Apache Incubator project and graduated from the incubator on the 19th of May 2020. Its first public commit was on 2017-12-19. More on the story in "A Short Introduction to Apache Iceberg" by Christine Mathiesen (Expedia Group Technology, Medium).

Iceberg also uses manifest files to track the data files that make up a table, similar to Delta Lake.
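To make the idea of manifest-based file tracking concrete, here is a minimal toy sketch in Python. It is not the Iceberg file format or any Iceberg API: the names `DataFile`, `Manifest`, `Snapshot`, `plan_scan` and `append_files` are hypothetical, and the single `lower`/`upper` column range stands in for the richer per-file statistics a real manifest would carry. It only illustrates the general pattern: a snapshot points at manifests, manifests list data files with some statistics, and a reader can use those statistics to skip files.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    path: str
    record_count: int
    lower: int  # assumed: min value of one tracked column in this file
    upper: int  # assumed: max value of the same column in this file


@dataclass
class Manifest:
    # A manifest lists data files together with their statistics.
    files: list[DataFile] = field(default_factory=list)


@dataclass
class Snapshot:
    # A snapshot is one version of the table: a set of manifests.
    snapshot_id: int
    manifests: list[Manifest] = field(default_factory=list)


def plan_scan(snapshot: Snapshot, lo: int, hi: int) -> list[str]:
    """Return paths of files whose [lower, upper] range overlaps [lo, hi]."""
    return [
        f.path
        for m in snapshot.manifests
        for f in m.files
        if f.upper >= lo and f.lower <= hi
    ]


def append_files(current: Snapshot, new_files: list[DataFile]) -> Snapshot:
    """Build a new snapshot that reuses the old manifests and adds one more."""
    new_manifest = Manifest(list(new_files))
    return Snapshot(current.snapshot_id + 1, current.manifests + [new_manifest])


s1 = Snapshot(1, [Manifest([
    DataFile("a.parquet", 100, 0, 9),
    DataFile("b.parquet", 100, 10, 19),
])])
s2 = append_files(s1, [DataFile("c.parquet", 50, 20, 29)])

print(plan_scan(s1, 15, 25))  # ['b.parquet']
print(plan_scan(s2, 15, 25))  # ['b.parquet', 'c.parquet']
```

In this sketch the older snapshot `s1` stays readable after the append, because the new snapshot only adds a manifest rather than rewriting existing ones.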

It is made to serve as the foundation layer of data lakes.

# Features

All of these features also exist in Delta Lake and are similar to those of Apache Hudi.

# History

Iceberg is pushed by Dremio, and by Snowflake with its Iceberg Tables. Google Cloud announced support as well on 2023-06-20.

Tabular (Iceberg) is the independent data platform built by the original creators of Apache Iceberg. Tabular addresses the pain data engineers and data scientists endure fighting the shortcomings of their data infrastructure. It was founded by Netflix alumni Ryan Blue, Dan Weeks and Jason Reid, and was later acquired by Databricks.


Origin: Data Lake Table Format
References: Apache Iceberg
Created 2022-08-11