🧠 Second Brain

Ibis (Python)

Last updated Feb 16, 2025

Ibis is a Python library that provides a lightweight, universal interface for data wrangling. It helps Python users explore and transform data of any size, stored anywhere.

Ibis has three primary components:

  1. A dataframe API for Python. Python users can write Ibis code to manipulate tabular data.
  2. Interfaces to 15+ query engines. Wherever data is stored, people can use Ibis as their API of choice to communicate with any of those query engines.
  3. Deferred execution. Ibis uses deferred execution, so execution of code is pushed to the query engine. Users can execute at the speed of their backend, not their local computer.

Why Use Ibis?

Ibis aims to be a future-proof solution to interacting with data using Python and can accomplish this goal through its main features:

Upgraded to DuckDB
“default DuckDB backend, and DuckDB is much more performant”: Farewell pandas, and thanks for all the fish. – Ibis

Common Use Cases

Backends

Ibis acts as a universal frontend to the following systems:

The list of supported backends is continuously growing. Anyone can get involved in adding new ones! Learn more about contributing to Ibis in our contributing documentation at https://github.com/ibis-project/ibis/blob/master/docs/CONTRIBUTING.md

# Installation

Install Ibis from PyPI with:

pip install 'ibis-framework[duckdb]'

Getting Started with Ibis

We provide a number of tutorial and example notebooks in the ibis-examples repository. The easiest way to try these out is through the online interactive notebook environment provided here: Binder

You can also get started analyzing any dataset, anywhere with just a few lines of Ibis code. Here’s an example of how to use Ibis with a SQLite database.

Download the SQLite database from the ibis-tutorial-data GCS (Google Cloud Storage) bucket, then connect to it using Ibis.

curl -LsS -o geography.db 'https://storage.googleapis.com/ibis-tutorial-data/geography.db'

Connect to the database and show the available tables

>>> import ibis
>>> from ibis import _
>>> ibis.options.interactive = True
>>> con = ibis.sqlite.connect("geography.db")
>>> con.tables
Tables
------
- countries
- gdp
- independence

Choose the countries table and preview its first few rows

>>> countries = con.tables.countries
>>> countries.head()
┏━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ iso_alpha2 ┃ iso_alpha3 ┃ iso_numeric ┃ fips   ┃ name                 ┃ capital          ┃ area_km2 ┃ population ┃ continent ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ string     │ string     │ int32       │ string │ string               │ string           │ float64  │ int32      │ string    │
├────────────┼────────────┼─────────────┼────────┼──────────────────────┼──────────────────┼──────────┼────────────┼───────────┤
│ AD         │ AND        │          20 │ AN     │ Andorra              │ Andorra la Vella │    468.0 │      84000 │ EU        │
│ AE         │ ARE        │         784 │ AE     │ United Arab Emirates │ Abu Dhabi        │  82880.0 │    4975593 │ AS        │
│ AF         │ AFG        │           4 │ AF     │ Afghanistan          │ Kabul            │ 647500.0 │   29121286 │ AS        │
│ AG         │ ATG        │          28 │ AC     │ Antigua and Barbuda  │ St. Johns        │    443.0 │      86754 │ NA        │
│ AI         │ AIA        │         660 │ AV     │ Anguilla             │ The Valley       │    102.0 │      13254 │ NA        │
└────────────┴────────────┴─────────────┴────────┴──────────────────────┴──────────────────┴──────────┴────────────┴───────────┘

Show the 5 least populous countries in Asia

>>> (
...     countries.filter(_.continent == "AS")
...     .select("name", "population")
...     .order_by(_.population)
...     .limit(5)
... )
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ name                           ┃ population ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ string                         │ int32      │
├────────────────────────────────┼────────────┤
│ Cocos [Keeling] Islands        │        628 │
│ British Indian Ocean Territory │       4000 │
│ Brunei                         │     395027 │
│ Maldives                       │     395650 │
│ Macao                          │     449198 │
└────────────────────────────────┴────────────┘

Origin: Upcoming Data Engineering Tools : r/dataengineering
Created 2023-09-25