QuackStore: OLAP Cache Layer for DuckDB

Last updated by Simon Späti

QuackStore is a DuckDB extension that adds a local cache layer for OLAP queries on remote files.

Created by Coginiti.

# Example

INSTALL quackstore FROM community;
LOAD quackstore;

SET GLOBAL quackstore_cache_path = '/tmp/my_duckdb_cache.bin';
SET GLOBAL quackstore_cache_enabled = true;

.timer on

-- Slow: Downloads every time
SELECT count(*) FROM read_csv('https://noaa-ghcn-pds.s3.amazonaws.com/csv.gz/by_year/2025.csv.gz');

-- Fast: Cached after first download
SUMMARIZE FROM read_csv('quackstore://https://noaa-ghcn-pds.s3.amazonaws.com/csv.gz/by_year/2025.csv.gz');

The result: the first run, with no cache yet, takes 49.366 seconds while the cache is being generated:

count_star()
------------
26016543    
Run Time (s): real 49.366 user 51.777825 sys 0.449690

The second run, served from the cache, takes 3.304 seconds:

count_star()
------------
26016543    
Run Time (s): real 3.304 user 7.630344 sys 0.237343
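To put the two timings side by side, here is a small arithmetic sketch in Python. It uses only the `real` wall-clock values reported above; the variable names are illustrative and not part of QuackStore or DuckDB.

```python
# Wall-clock ("real") timings copied from the two runs above, in seconds.
cold_seconds = 49.366  # first run: no cache yet, file is downloaded
warm_seconds = 3.304   # second run: served from the local cache

speedup = cold_seconds / warm_seconds
saved = cold_seconds - warm_seconds

print(f"speedup: {speedup:.1f}x")
print(f"time saved per query: {saved:.3f} s")
```

For this single measurement, the cached run is roughly 15 times faster. The numbers come from one machine and one network connection, so treat the ratio as indicative only.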

The cache file is 116 MB for this 26-million-row dataset.

Even the SUMMARIZE query:

SUMMARIZE FROM read_csv('quackstore://https://noaa-ghcn-pds.s3.amazonaws.com/csv.gz/by_year/2025.csv.gz');

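was fast afterwards, even though this specific query had not been run before. It took only 4.100 seconds on its first run, presumably because the cache holds the downloaded file rather than query results.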

# More from GitHub

There are more examples in the GitHub repository:

-- Cache a CSV file from GitHub
SELECT * FROM 'quackstore://https://raw.githubusercontent.com/owner/repo/main/data.csv';

-- Cache a single Parquet file from S3
SELECT * FROM parquet_scan('quackstore://s3://example_bucket/data/file.parquet');

-- Cache a whole Iceberg catalog from S3
SELECT * FROM iceberg_scan('quackstore://s3://example_bucket/iceberg/catalog');

-- Cache any web resource
SELECT content FROM read_text('quackstore://https://example.com/file.txt');
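All of the examples above follow the same pattern: the original URL is prefixed with `quackstore://`. If you generate queries from a script, a minimal Python sketch of that pattern could look like this. The helpers `cached_url` and `count_query` are hypothetical names for illustration and are not part of QuackStore.

```python
CACHE_PREFIX = "quackstore://"


def cached_url(url: str) -> str:
    """Return the URL with the quackstore:// prefix, adding it only once."""
    if url.startswith(CACHE_PREFIX):
        return url
    return CACHE_PREFIX + url


def count_query(url: str) -> str:
    """Build a row-count query over a CSV file read through the cache."""
    # Double any single quotes so the URL stays a valid SQL string literal.
    literal = cached_url(url).replace("'", "''")
    return f"SELECT count(*) FROM read_csv('{literal}');"


print(count_query("https://noaa-ghcn-pds.s3.amazonaws.com/csv.gz/by_year/2025.csv.gz"))
```

The printed statement can be pasted into a DuckDB session once the extension is loaded and the cache settings from the first example are in place.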

Origin: DuckDB
References:
Created 2025-11-04