Cube

Last updated Mar 8, 2024

Formerly known as Cube.js, now simply Cube (cube.dev). Cube is a Semantic Layer that offers OLAP cube capabilities and also serves as an analytics API: data can be fetched through SQL, REST, and GraphQL out of the box.
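To make the API side concrete, here is a minimal sketch of how a request to Cube's REST `load` endpoint could be assembled in Python. The host and port, the token placeholder, and the `orders` cube with its members are assumptions for illustration only; the sketch builds the request but does not send it.

```python
import json
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed local deployment URL; replace with your own Cube instance.
CUBE_API_URL = "http://localhost:4000/cubejs-api/v1/load"


def build_load_request(query: dict, token: str) -> tuple[str, dict]:
    """Return the URL and headers for a GET request to the load endpoint."""
    # The query object is passed JSON-encoded in the `query` URL parameter.
    url = f"{CUBE_API_URL}?{urlencode({'query': json.dumps(query)})}"
    headers = {"Authorization": token}
    return url, headers


# Hypothetical cube and member names.
query = {
    "measures": ["orders.count"],
    "dimensions": ["orders.status"],
    "limit": 10,
}

url, headers = build_load_request(query, token="<your-jwt>")
```

The resulting `url` and `headers` can be handed to any HTTP client. The query is described in terms of measures and dimensions rather than tables and columns, which is the point of a semantic layer.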

# Cube Store (Cache Layer)

In Episode 2: Headless BI with Pavel Tiunov - The Analytics Everywhere Podcast (on Spotify), Pavel Tiunov mentioned that they initially experimented with MySQL and Postgres for the OLAP Cache Layer, but these weren’t fast enough.

Consequently, they built their own solution on top of DataFusion, combined with custom code. Their blog post describes the background:

Historically, pre-aggregations were either stored alongside source data in a database (e.g., PostgreSQL or MySQL) or in a custom-provisioned instance of the same databases for read-only or cost-ineffective data sources (e.g., AWS Athena or BigQuery). Typically, these would be asynchronously refreshed by a dedicated worker instance. Although this was a viable solution, the pre-aggregation database often became a scalability bottleneck for the analytical API.
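To illustrate what a pre-aggregation is, here is a toy sketch in plain Python. It is not how Cube Store works internally; the `raw_orders` rows and the function names are made up for illustration. Raw rows are rolled up once, and later queries are answered from the much smaller rollup.

```python
from collections import defaultdict

# Hypothetical raw fact rows: (order_date, status, amount).
raw_orders = [
    ("2024-01-01", "completed", 120.0),
    ("2024-01-01", "completed", 80.0),
    ("2024-01-01", "cancelled", 50.0),
    ("2024-01-02", "completed", 200.0),
]


def build_rollup(rows):
    """Aggregate the raw rows once, keyed by (date, status)."""
    rollup = defaultdict(lambda: {"count": 0, "total_amount": 0.0})
    for date, status, amount in rows:
        cell = rollup[(date, status)]
        cell["count"] += 1
        cell["total_amount"] += amount
    return dict(rollup)


def completed_revenue_by_day(rollup):
    """Answer a query from the rollup without touching the raw rows."""
    result = defaultdict(float)
    for (date, status), cell in rollup.items():
        if status == "completed":
            result[date] += cell["total_amount"]
    return dict(result)


rollup = build_rollup(raw_orders)
revenue = completed_revenue_by_day(rollup)
```

In a real setup the rollup would be refreshed asynchronously by a worker, as the quote describes, and where the rollup is stored determines how well the analytical API scales.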

This breakthrough in the OLAP Cache Layer is quite remarkable. A particularly insightful article on how Cube achieves sub-second query times and manages compute is RW Cube on Latency and Caching.

Further details can be found in Cube Store.

# Integrations / Collaborations

Cube has integrated with various other metrics layers like dbt (see an example here: Combining dbt Metrics with API, Caching, and Access Control - Cube Blog). Their focus extends to the OLAP Cache Layer, Security, and Data Governance.

# Embedded Analytics with MotherDuck

An Elegant Data Stack for Embedded Analytics

# Dashboards

The first dashboard integration was with Superset; many other tools are now supported.

# Caching Layer

Cube replaced Redis with their bespoke solution, Cube Store. Details in RW Replacing Redis With Cube Store - Cube Blog.

A workflow illustrating idempotent query execution with caching is depicted here:
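The same idea can be approximated in a few lines of Python. This is a minimal sketch and not Cube's actual implementation: `QueryCache`, `fake_warehouse`, and the query shape are hypothetical. Queries are reduced to a canonical key, so an identical query is executed only once and then served from the cache.

```python
import hashlib
import json


class QueryCache:
    """In-memory cache keyed by a canonical hash of the query."""

    def __init__(self, execute):
        self._execute = execute  # function that runs the query at the source
        self._results = {}
        self.executions = 0  # how often the source was actually hit

    @staticmethod
    def cache_key(query: dict) -> str:
        # Sorting keys makes logically identical queries hash identically.
        canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def load(self, query: dict):
        key = self.cache_key(query)
        if key not in self._results:
            self.executions += 1
            self._results[key] = self._execute(query)
        return self._results[key]


def fake_warehouse(query):
    # Stand-in for an expensive query against the data source.
    return [{"orders.count": 42}]


cache = QueryCache(fake_warehouse)
first = cache.load({"measures": ["orders.count"], "limit": 10})
second = cache.load({"limit": 10, "measures": ["orders.count"]})
```

Both calls return the same result, and the stand-in warehouse is hit only once even though the two dictionaries list their keys in a different order. A production cache would also need expiry and refresh rules, which this sketch leaves out.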

# Introducing Views in Cube

Views were introduced on 2022-10-12, as detailed in this article: RW Introducing Views for Defining and Managing Metrics - Cube Blog.

This feature lets you combine cubes, such as companies and users, into a single interface that exposes, for example, active users.
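As a rough mental model only, the sketch below mimics this in plain Python. It is not Cube's data model syntax; the rows, field names, and `active_users_view` function are hypothetical. Two "cubes" are joined, and only a curated result is exposed to consumers.

```python
# Hypothetical rows for two cubes.
companies = [
    {"id": 1, "name": "Acme"},
    {"id": 2, "name": "Globex"},
]
users = [
    {"id": 10, "company_id": 1, "is_active": True},
    {"id": 11, "company_id": 1, "is_active": False},
    {"id": 12, "company_id": 2, "is_active": True},
    {"id": 13, "company_id": 2, "is_active": True},
]


def active_users_view(companies, users):
    """Join users to companies and expose only company name and active-user count."""
    names = {c["id"]: c["name"] for c in companies}
    counts = {name: 0 for name in names.values()}
    for user in users:
        if user["is_active"]:
            counts[names[user["company_id"]]] += 1
    return [
        {"company": name, "active_users": n}
        for name, n in sorted(counts.items())
    ]


view = active_users_view(companies, users)
```

Consumers of the view see only `company` and `active_users`; the join keys and the underlying cube structure stay hidden, which is what makes a view usable as a contract.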

# My Quick Take

The concept of a view layer is fascinating, reminiscent of my days creating Data Marts with Views in Oracle.

To be honest, I’m still exploring the nuances of Cube. My initial assumption was that Cube would function like data marts, akin to a singular view. However, the visualizations suggest a more intricate interface design. The idea of establishing contracts and schemas is intriguing and logical. This aspect of Cube certainly piques my interest for deeper exploration.

On a side note, I was pleasantly surprised to discover that metrics can be defined using JavaScript in Cube. While Python would have been my preference, JavaScript is a suitable choice given its integration with the frontend.

This ties into my recent discussion with Igor Lukanin about the workings of Cube and broader topics.

Created 2022-04-10