2025)

🎉 Highlights

In this release, we are excited to announce support for writing Iceberg tables and Parquet files, DataFrame GroupBy operations, and numerous other features. Refer to our documentation for a complete list of features now supported.

✨ New Features

Added Iceberg write support using DataFrame.to_iceberg(). Features include simple filesystem writes, partition spec, and sort order.
Support writing Parquet files using DataFrame.to_parquet().
Added support for simple filesystem reads in read_iceberg().
Support for DataFrame.groupby() with aggregate functions including sum, count, and max.
Support for DataFrameGroupBy and SeriesGroupBy aggregate()/agg().
Added 8 Series.str methods including str.extract() and str.split(), achieving Series.str method coverage of 96% (54 out of 56).
Added 5 Series reduction methods including Series.max() and Series.sum().
Support for pd.to_datetime() and timedelta types/methods.
Added top-level null check methods such as pd.isnull().
Support for Series sort_values().
Optimized support for sort_values() followed by head().
Support for DataFrame column renaming.
Support for arithmetic expressions on DataFrames, e.g., df["new_col"] = df["A"] + df["B"].
Support for bodo.pandas.DataFrame/Series constructors.
Support for filtering expressions on Series, e.g., s[s > 10].
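Several of the features listed above can be sketched together in one short example. This is only an illustration: it is written against plain pandas, on the assumption that bodo.pandas accepts the same calls when imported in its place. The column names and data are made up.

```python
import pandas as pd  # assumption: bodo.pandas can be imported in place of pandas here

# Hypothetical sample data
df = pd.DataFrame({
    "key": ["a", "b", "a", "b"],
    "A": [1, 2, 3, 4],
    "B": [10, 20, 30, 40],
})

# Arithmetic expression assigned to a new column
df["new_col"] = df["A"] + df["B"]

# GroupBy with a single aggregate function, and with agg()
totals = df.groupby("key")["new_col"].sum()
stats = df.groupby("key").agg({"A": "max", "B": "count"})

# Filtering expression on a Series
s = df["new_col"]
large = s[s > 20]

# sort_values() followed by head(), the pattern the release notes call out as optimized
top2 = df.sort_values("new_col", ascending=False).head(2)
```

Whether a given call runs natively or falls back to pandas depends on the Bodo version in use, so the fallback warnings mentioned under Bug Fixes are worth watching for.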

🐛 Bug Fixes

Added fallback warnings for unsupported Series methods.
Improved DataFrame and Series expression support in filters and column assignments.

⚙️ Dependency Upgrades

Bodo now supports Python 3.13.
Removed many dependency version constraints.

2025.7.1

Minor release to fix a bug when reading Parquet files from S3.

2025.7.2

🎉 Highlights

Bodo is now available on conda-forge for all supported platforms.

✨ New Features

Added bodo.pandas.read_iceberg_table for reading PyIceberg tables.
Added 8 Series.dt methods/accessors including dt.isocalendar().
Improved the performance of DataFrame.apply() calls significantly in some cases.
Support for na_action and kws in the JIT implementation of Series.map().
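As a rough illustration of two items above, the sketch below calls dt.isocalendar() and Series.map() with na_action. It uses plain pandas and assumes bodo.pandas accepts the same calls. The kws argument is not shown because its exact form is not described here.

```python
import pandas as pd  # assumption: bodo.pandas can be imported in place of pandas here

# dt.isocalendar() returns a DataFrame with ISO year, week, and day columns
dates = pd.Series(pd.to_datetime(["2025-07-01", "2025-12-29"]))
iso = dates.dt.isocalendar()

# na_action="ignore" skips missing values instead of passing them to the function
s = pd.Series(["x", None, "yz"])
lengths = s.map(len, na_action="ignore")
```

Note that 2025-12-29 falls in ISO week 1 of 2026, which is the kind of edge case isocalendar() exists to handle.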

🐛 Bug Fixes

Fixed a parallelism bug for distributing in-memory dataframes.
Fixed issues with repeated sub-plans in join sides.
Fixed memory leaks when distributing Pandas dataframes for plan execution.

⚙️ Dependency Upgrades

Upgraded zstd, boost, aws-sdk-cpp and pyarrow dependencies.

2025.7.3

Minor release to re-enable Python 3.9 support.

2025.7.4

✨ New Features

Updated the Iceberg quickstart docs to use the DataFrame library in examples.
Support for Series.mean() in the DataFrame library.
Support s3a paths in Iceberg IO.

⚙️ Dependency Upgrades

Removed maximum Python version restriction.

2025.7.5

✨ New Features

Added support for more Series methods, including describe() and agg().
Support for assigning the output of Series.str.cat() back into the DataFrame as a new column.
Support for filters on top of joins in plans that turn into non-equi joins.
Suppressed excessive fallback warnings.
Enabled column selection on GroupBy objects through attribute access (getattr).
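The following sketch shows what the Series and GroupBy items above look like in user code. It is written against plain pandas, assuming bodo.pandas accepts the same calls. The column names and data are hypothetical.

```python
import pandas as pd  # assumption: bodo.pandas can be imported in place of pandas here

# Hypothetical sample data
df = pd.DataFrame({
    "team": ["x", "y", "x"],
    "given": ["a", "b", "c"],
    "family": ["d", "e", "f"],
    "score": [1.0, 2.0, 3.0],
})

# Assign the output of Series.str.cat() back as a new column
df["full"] = df["given"].str.cat(df["family"], sep="-")

# Column selection on a GroupBy object via attribute access
by_team = df.groupby("team").score.sum()

# Series.describe() and Series.agg()
summary = df["score"].describe()
extremes = df["score"].agg(["min", "max"])
```

Attribute access only works for column names that are valid Python identifiers and do not clash with GroupBy methods; bracket selection such as df.groupby("team")["score"] avoids both limits.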

🏎️ Performance Improvements

Added a cache flag to read_csv in the DataFrame library.
Improved performance of argument initialization inside of lazy plans.
Adjusted Iceberg and Parquet reader parameters to use less memory.

🐛 Bug Fixes

Fixed merge output columns not being created properly in some cases.
Fixed a Parquet read issue when filter columns are not used anywhere else in the code.
Fixed Parquet reads when Index metadata is missing.