🧠 Second Brain


Last updated Feb 9, 2024

# What is Openness?

Openness is an environment where trust and visibility flourish, mainly through Open Source. The word "open" at the root of the term is what matters. Open doesn't mean that all the code is open source on GitHub, but the primary pillars are built on open source.

Openness avoids vendor lock-in. Unlike buying a fully proprietary service that is a black box to you, you can change vendors along the way because your data is stored in an open-standard Data Lake File Format or Data Lake Table Format. If the data connectors that integrate data into your Data Warehouse break, you can fix them manually in an emergency. And if the central Orchestration tool gets discontinued, you still have all the code and can run everything yourself.
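To make the lock-in point a bit more concrete, here is a minimal sketch of the idea, not a real data lake setup. It assumes CSV as a stand-in for an open file format and uses two unrelated readers from the Python standard library (the `csv` module and SQLite) as stand-ins for two different vendors' engines. The sample rows and all function names are hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical sample data; stands in for a table stored in a data lake.
ROWS = [
    {"order_id": 1, "country": "CH", "amount": 120.0},
    {"order_id": 2, "country": "DE", "amount": 80.5},
    {"order_id": 3, "country": "CH", "amount": 42.0},
]
FIELDS = ["order_id", "country", "amount"]


def write_open_format(path: Path) -> None:
    """Persist the rows as CSV (assumption: a stand-in for an open file format)."""
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(ROWS)


def totals_with_plain_python(path: Path) -> dict:
    """'Engine A': aggregate amounts per country using only the csv module."""
    totals = {}
    with path.open(newline="") as f:
        for row in csv.DictReader(f):
            country = row["country"]
            totals[country] = totals.get(country, 0.0) + float(row["amount"])
    return totals


def totals_with_sqlite(path: Path) -> dict:
    """'Engine B': load the same file into SQLite and aggregate with SQL."""
    with path.open(newline="") as f:
        rows = [
            (int(r["order_id"]), r["country"], float(r["amount"]))
            for r in csv.DictReader(f)
        ]
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (order_id INTEGER, country TEXT, amount REAL)")
    con.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    result = dict(
        con.execute("SELECT country, SUM(amount) FROM orders GROUP BY country").fetchall()
    )
    con.close()
    return result


def compare_engines() -> tuple:
    """Write the file once, then read it with both engines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "orders.csv"
        write_open_format(path)
        return totals_with_plain_python(path), totals_with_sqlite(path)


if __name__ == "__main__":
    engine_a, engine_b = compare_engines()
    print(engine_a == engine_b)
```

Because the file format is not tied to either reader, swapping one engine for the other needs no export or migration step. That is the property open file and table formats are meant to give you at data lake scale.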

The list here could go on and on. The main point is to work collaboratively and sustainably, especially in a world where resources are scarce.

# Examples

A good example is building an Open Data Stack, where we take the open-source tools from the Modern Data Stack and integrate them into a homogeneous Data Stack.

Another famous example is the Data Lakehouse from Databricks, which builds a sustainable business model around open standards: it is based on open source with Spark and Delta Lake, but adds proprietary features such as collaborative notebooks, the Photon Engine, Machine Learning capabilities, and many more.

Something similar happens at SnowflakeDB: its Cloud Data Warehouse is closed source, but it integrates with the open standard Apache Iceberg (called Iceberg Tables) and with open-source tools such as Snowplow and dbt.

Dremio is another example: it builds its business model on top of Apache Iceberg and Apache Arrow, eliminating vendor lock-in for larger organizations.

Created 2023-01-25