
Bodo 2025.7 Release (Date: 06/27/2025)

🎉 Highlights

In this release, we are excited to announce support for writing Iceberg tables and Parquet files, DataFrame GroupBy operations, and numerous other features. Refer to our documentation for a complete list of features now supported.

✨ New Features

  • Added Iceberg write support using DataFrame.to_iceberg(). Features include simple filesystem writes, partition spec, and sort order.
  • Support writing Parquet files using DataFrame.to_parquet().
  • Added support for simple filesystem reads in read_iceberg().
  • Support for DataFrame.groupby() with aggregate functions including sum, count, and max.
  • Support for DataFrameGroupBy and SeriesGroupBy aggregate()/agg().
  • Added 8 Series.str methods including str.extract() and str.split(), achieving Series.str method coverage of 96% (54 out of 56).
  • Added 5 Series reduction methods including Series.max() and Series.sum().
  • Support for pd.to_datetime() and timedelta types/methods.
  • Added top-level null check methods such as pd.isnull().
  • Support for Series sort_values().
  • Optimized support for sort_values() followed by head().
  • Support for DataFrame column renaming.
  • Support for arithmetic expressions on DataFrames, e.g., df["new_col"] = df["A"] + df["B"].
  • Support for bodo.pandas.DataFrame/Series constructors.
  • Support for filtering expressions on Series, e.g., s[s > 10].
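
As a rough illustration of several of the features listed above (constructors, column arithmetic, groupby aggregation, sort_values() followed by head(), and Series filtering), here is a minimal sketch. It assumes bodo.pandas can be imported in place of pandas for these calls and falls back to plain pandas when Bodo is not installed; the column names and data are made up for the example.

```python
# Assumption: bodo.pandas acts as a drop-in replacement for pandas here.
# If Bodo is not installed, the same code runs on plain pandas.
try:
    import bodo.pandas as pd
except ImportError:
    import pandas as pd

# Hypothetical sample data, built with the DataFrame constructor.
df = pd.DataFrame(
    {
        "key": ["a", "b", "a", "b"],
        "A": [1, 2, 3, 4],
        "B": [10, 20, 30, 40],
    }
)

# Arithmetic expression assigned to a new column.
df["new_col"] = df["A"] + df["B"]

# GroupBy with per-column aggregate functions via agg().
totals = df.groupby("key").agg({"new_col": "sum", "A": "max"})

# sort_values() followed by head(): the two largest rows by new_col.
top2 = df.sort_values("new_col", ascending=False).head(2)

# Filtering expression on a Series.
s = df["new_col"]
large = s[s > 20]
```

Whether each call runs natively in Bodo or falls back to pandas depends on the installed version, so treat this as a sketch of the API surface rather than a statement about execution paths.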

🐛 Bug Fixes

  • Added fallback warnings for unsupported Series methods.
  • Improved DataFrame and Series expression support in filters and column assignments.

⚙️ Dependency Upgrades

  • Bodo now supports Python 3.13.
  • Removed many dependency version constraints.

2025.7.1

Minor release to fix a bug when reading Parquet files from S3.

2025.7.2

🎉 Highlights

Bodo is now available on conda-forge for all supported platforms.

✨ New Features

  • Added bodo.pandas.read_iceberg_table() for reading PyIceberg tables.
  • Added 8 Series.dt methods/accessors including dt.isocalendar().
  • Significantly improved the performance of DataFrame.apply() calls in some cases.
  • Added support for na_action and kws in the JIT implementation of Series.map().
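
For readers unfamiliar with two of the calls named above, the following minimal sketch shows dt.isocalendar() and the na_action argument of Series.map(). It assumes bodo.pandas can stand in for pandas and falls back to plain pandas otherwise. The map() line only illustrates the standard pandas meaning of na_action="ignore" (missing values are passed through without calling the function); it does not exercise Bodo's JIT path, and the sample values are made up.

```python
# Assumption: bodo.pandas acts as a drop-in replacement for pandas here.
# If Bodo is not installed, the same code runs on plain pandas.
try:
    import bodo.pandas as pd
except ImportError:
    import pandas as pd

# dt.isocalendar() returns a DataFrame with "year", "week", and "day" columns.
dates = pd.Series(pd.to_datetime(["2025-06-27", "2025-07-04"]))
iso = dates.dt.isocalendar()

# na_action="ignore" skips missing values instead of passing them to len().
names = pd.Series(["x", None, "yz"])
lengths = names.map(len, na_action="ignore")
```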

🐛 Bug Fixes

  • Fixed a parallelism bug for distributing in-memory dataframes.
  • Fixed issues with repeated sub-plans in join sides.
  • Fixed memory leaks when distributing Pandas dataframes for plan execution.

⚙️ Dependency Upgrades

  • Upgraded zstd, boost, aws-sdk-cpp and pyarrow dependencies.

2025.7.3

Minor release to re-enable Python 3.9 support.

2025.7.4

✨ New Features

  • Updated the quickstart Iceberg docs to use the DataFrame library in examples.
  • Support for Series.mean() in the DataFrame library.
  • Support for s3a paths in Iceberg IO.

⚙️ Dependency Upgrades

  • Removed maximum Python version restriction.

2025.7.5

✨ New Features

  • Added support for more Series methods, including describe() and agg().
  • Support for assigning the output of Series.str.cat() back to the DataFrame as a new column.
  • Support for filters on top of joins in plans that turn into non-equi joins.
  • Suppressed excessive fallback warnings.
  • Enabled column selection on GroupBy objects through attribute access (getattr).
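
The Series and GroupBy items above can be sketched as follows. This is an illustrative example only: it assumes bodo.pandas can be imported in place of pandas (falling back to plain pandas if Bodo is missing), and the column names and values are hypothetical.

```python
# Assumption: bodo.pandas acts as a drop-in replacement for pandas here.
# If Bodo is not installed, the same code runs on plain pandas.
try:
    import bodo.pandas as pd
except ImportError:
    import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {
        "first": ["Ada", "Alan"],
        "last": ["Lovelace", "Turing"],
        "team": ["x", "x"],
        "score": [3, 5],
    }
)

# Assign the output of Series.str.cat() back as a new column.
df["full"] = df["first"].str.cat(df["last"], sep=" ")

# Column selection on a GroupBy object through attribute access.
per_team = df.groupby("team").score.sum()

# Series.agg() with several functions, and Series.describe().
stats = df["score"].agg(["min", "max"])
summary = df["score"].describe()
```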

🏎️ Performance Improvements

  • Added a cache flag to read_csv() in the DataFrame library.
  • Improved performance of argument initialization inside of lazy plans.
  • Adjusted Iceberg and Parquet reader parameters to use less memory.

🐛 Bug Fixes

  • Fixed merge output columns not being created properly in some cases.
  • Fixed a Parquet read issue with filter columns that are not used anywhere else in the code.
  • Fixed Parquet reads when Index metadata is missing.