Open Table Format Catalogs
Open catalogs play the role the Hive Metastore used to play: they are an index of the tables you have in your data lake.
In a relational database, the equivalent is the INFORMATION_SCHEMA, which most databases support and which you can query with `SELECT * FROM INFORMATION_SCHEMA.tables;`.
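To make the relational analogy concrete, here is a minimal sketch of asking a database which tables it holds. It assumes SQLite, chosen only because it ships with Python. SQLite has no INFORMATION_SCHEMA, but its built-in `sqlite_master` table serves the same purpose. The helper `list_tables` and the example tables are made up for illustration.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the user-defined table names recorded in SQLite's own catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Hypothetical example tables in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")

print(list_tables(conn))  # ['customers', 'orders']
```

A data lake catalog answers the same question, except that the tables are files in object storage and the catalog is a separate service or file rather than part of the database.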
A great overview from
YouTube Discussion:
Compatibility matrix of engines (rows) and catalogs (columns):
Engine | Unity Catalog | Glue Catalog | Snowflake Horizon | Polaris Catalog | BigQuery Metastore |
---|---|---|---|---|---|
Databricks | 🟢 Full Support | 🟡 Some Support | 🔴 No Access | 🔴 No Access | 🔴 No Access |
AWS | 🟡 Some Support | 🟢 Full Support | 🟡 Some Support | 🔴 No Access | 🔴 No Access |
Fabric | 🔴 No Access | 🔴 No Access | 🔴 No Access | 🔴 No Access | 🔴 No Access |
Snowflake | 🟢 Full Support | 🟡 Some Support | 🟢 Full Support | 🔴 No Access | 🔴 No Access |
OSS Iceberg Clients | 🟢 Full Support | 🟡 Some Support | 🟡 Some Support | 🟢 Full Support | 🔴 No Access |
BigQuery | 🔴 No Access | 🔴 No Access | 🔴 No Access | 🔴 No Access | 🟢 Full Support |
Legend:
- 🟢 Full Support
- 🟡 Some Support
- 🔴 No Access
Image inspired by The Whys of Managed Iceberg with Databricks - see img_Open Table Format Catalogs_1746013576709.webp
# Different Catalogs
Open Source Catalogs:
- Apache Polaris Catalog: Fully open source, designed for broad compatibility with Iceberg clients
- Iceberg Catalog: Reference implementation, lightweight and standards-compliant
- DuckLake: Catalog + Table Format in one by DuckDB Labs
Vendor-Managed Catalogs:
- Unity Catalog (Databricks): Advanced governance features, strong integration with Databricks ecosystem
- AWS Glue Catalog: Deep AWS integration, serverless metadata management
- Snowflake Horizon Catalog: Native Snowflake integration with governance capabilities
- BigQuery Metastore: Google Cloud native, designed for multi-engine support
Lightweight Alternatives:
- File-based catalogs: Solutions like boring-catalog that use simple JSON files for basic catalog functionality
# Utilities
- GitHub - boringdata/boring-catalog: A lightweight, file-based Iceberg catalog implementation using a single JSON file (e.g., on S3, local disk, or any fsspec-compatible storage).
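To illustrate the file-based idea, here is a toy sketch of a catalog kept in a single JSON file. It is not boring-catalog's actual file layout or API. The class `JsonFileCatalog`, its methods, the JSON structure, and the `s3://example-bucket/...` locations are all hypothetical. The sketch only shows the core job of a catalog, which is mapping a table identifier to the location of its current metadata.

```python
import json
import tempfile
from pathlib import Path


class JsonFileCatalog:
    """Toy catalog: one JSON file mapping 'namespace.table' to a metadata location."""

    def __init__(self, path: Path) -> None:
        self.path = path
        if not self.path.exists():
            self._write({"tables": {}})

    def _read(self) -> dict:
        return json.loads(self.path.read_text())

    def _write(self, state: dict) -> None:
        # Write a sibling temp file, then rename it over the catalog file.
        # On a local filesystem this keeps readers from seeing a partial write.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(state, indent=2, sort_keys=True))
        tmp.replace(self.path)

    def register_table(self, identifier: str, metadata_location: str) -> None:
        """Add a table, or point an existing table at new metadata."""
        state = self._read()
        state["tables"][identifier] = metadata_location
        self._write(state)

    def list_tables(self, namespace: str) -> list[str]:
        """Return the sorted identifiers that belong to the given namespace."""
        prefix = namespace + "."
        return sorted(t for t in self._read()["tables"] if t.startswith(prefix))

    def load_table(self, identifier: str) -> str:
        """Return the metadata location registered for the identifier."""
        return self._read()["tables"][identifier]


# Usage with a temporary local directory and made-up metadata locations.
catalog = JsonFileCatalog(Path(tempfile.mkdtemp()) / "catalog.json")
catalog.register_table(
    "sales.orders", "s3://example-bucket/sales/orders/metadata/v1.metadata.json"
)
catalog.register_table(
    "sales.customers", "s3://example-bucket/sales/customers/metadata/v1.metadata.json"
)
catalog.register_table(
    "hr.employees", "s3://example-bucket/hr/employees/metadata/v1.metadata.json"
)

print(catalog.list_tables("sales"))
print(catalog.load_table("sales.orders"))
```

A real catalog also has to handle concurrent writers safely, for example through an atomic compare-and-swap on the metadata pointer. The sketch leaves that out, and it is a large part of why managed catalog services exist.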
# Further Reads
Origin: Data Lake Table Format
References:
Created 2025-04-30