🧠 Second Brain




Last updated Mar 13, 2024

Pandas is a software library crafted for the Python programming language, aimed at data manipulation and analysis. It particularly shines in providing data structures and operations for working with numerical tables and time series. As free software, it is distributed under the three-clause BSD license.
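As a small illustration of those two strengths, here is a minimal sketch with made-up data (the `temperature` column and the dates are hypothetical): one tabular aggregation and one time-series resample.

```python
import pandas as pd

# Hypothetical daily measurements, for illustration only
ts = pd.DataFrame(
    {"temperature": [20.0, 22.0, 19.0, 21.0]},
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)

# Tabular operation: summary statistic over a column
mean_temp = ts["temperature"].mean()

# Time-series operation: downsample the daily values into 2-day buckets
two_day_mean = ts["temperature"].resample("2D").mean()

print(mean_temp)
print(two_day_mean)
```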

Koalas brings the Pandas API to Apache Spark, targeting larger, distributed datasets. Since Spark 3.2 it ships with PySpark as `pyspark.pandas`.

# Pandas 2.0

For further insights, explore Pandas 2.0 and its Ecosystem (Arrow, Polars, DuckDB) | ssp.sh.

# Star History

Interestingly, Apache Spark was started slightly after Pandas, yet their popularity trajectories are nearly identical: link

# Read dataframes with SQL

You can use pandasql or DuckDB:

Created 2022-08-07