
Bodo 2026.5 Release (Date: 05/22/2026)

🎉 Highlights

In this release, we are excited to add GPU support for more operators in Bodo DataFrames' GPU execution mode, including sort, Iceberg I/O, and string functions. GPU execution mode is available through the CUDA-enabled variant of our conda package. The release also includes several bug fixes and UX improvements.

✨ New Features

GPU execution mode

The following operators now run on GPUs:

  • Iceberg read and write support: DataFrame.to_iceberg() and read_iceberg()

  • sort_values()

  • Cross joins

  • String functions: str.contains, str.match, str.slice, and str.strip

  • Series.isin()

  • Series.notna() & Series.notnull()

  • Taking the power of a Series (i.e. Series.__pow__)

  • SeriesGroupby & DataFrameGroupby all() and any()
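
As a rough illustration of the operators listed above, the sketch below runs several of them in plain pandas to show the semantics involved. It assumes Bodo DataFrames accepts the same calls when imported as `bodo.pandas`; the data and column names are made up, and nothing here enables or verifies GPU execution.

```python
import pandas as pd  # assumption: use `import bodo.pandas as pd` for Bodo DataFrames

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "name": ["  apple ", "banana", "cherry", None],
    "qty": [3, 1, 2, 4],
    "grp": ["x", "x", "y", "y"],
})

# sort_values()
sorted_qty = df.sort_values("qty")["qty"].tolist()

# String functions: str.strip followed by str.contains
stripped = df["name"].str.strip()
has_an = stripped.str.contains("an", na=False).tolist()

# Series.isin() and Series.notna()
in_set = df["qty"].isin([1, 4]).tolist()
not_null = df["name"].notna().tolist()

# Taking the power of a Series (Series.__pow__)
squared = (df["qty"] ** 2).tolist()

# Groupby any() and all() on a boolean Series
is_big = df["qty"] > 2
any_big = is_big.groupby(df["grp"]).any().tolist()
all_big = is_big.groupby(df["grp"]).all().tolist()

# Cross join: every row of the left frame paired with every row of the right
crossed = df[["grp"]].merge(df[["qty"]], how="cross")
```

The cross join produces one output row per pair of input rows, so the result here has 4 × 4 = 16 rows.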

In addition to these new operators, we made the following improvements:

  • Automatically set OMPI_MCA_pml=ucx during package import; users no longer have to explicitly set OMPI_MCA_pml.

  • Added support for the dropna argument in groupby.

  • Changed GPU batch size behavior to adjust batch size based on system GPU memory.

  • Fixed correctness issue in groupby nunique().

  • Fixed correctness issues with groupby Pandas fallback.

  • Fixed hang in broadcast join.

  • Fixed hang in pipelines that end in limit and transfer data between CPU and GPU.

  • Handle empty data/all null data in Series reductions.

  • Write Pandas metadata in to_parquet().
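
For the `dropna` argument mentioned in the list above, the following minimal sketch shows the pandas semantics that the option controls. It uses plain pandas with made-up data and assumes Bodo's groupby follows the same behavior.

```python
import pandas as pd  # assumption: use `import bodo.pandas as pd` for Bodo DataFrames

# Hypothetical data with null group keys.
df = pd.DataFrame({"key": ["a", None, "a", None], "val": [1, 2, 3, 4]})

# Default (dropna=True): rows whose key is null are excluded from the result.
with_default = df.groupby("key")["val"].sum()

# dropna=False: null keys form their own group.
keep_nulls = df.groupby("key", dropna=False)["val"].sum()
```

With the default, only the "a" group appears; with `dropna=False`, the rows with a null key are aggregated into an additional group.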

Bodo Features

The following features were added to both the CUDA and non-CUDA variants of the Bodo package:

  • Added support for taking the power of a Series (i.e. Series.__pow__) on CPU.

  • Added support for SeriesGroupby & DataFrameGroupby all() and any() on CPU.

  • Added support for specifying lists of functions in SeriesGroupby & DataFrameGroupby aggregate() function dicts (e.g. df.groupby(...).agg({"col1": ["mean", "count"], "col2": ["sum", "count"]})).
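
To make the dict-of-lists aggregation form concrete, here is a small sketch in plain pandas. It assumes Bodo DataFrames produces the same result shape as pandas; the `key` column and the data are hypothetical.

```python
import pandas as pd  # assumption: use `import bodo.pandas as pd` for Bodo DataFrames

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "key": ["a", "a", "b"],
    "col1": [1.0, 3.0, 5.0],
    "col2": [10, 20, 30],
})

# Each column maps to a list of aggregation functions.
result = df.groupby("key").agg({"col1": ["mean", "count"], "col2": ["sum", "count"]})

# In pandas, the output columns form a two-level index of (column, function) pairs.
columns = list(result.columns)
print(result)
```

In pandas, a single aggregated value is then addressed with a tuple, for example `result.loc["a", ("col1", "mean")]`.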

🐛 Bug Fixes

  • Fixed null and NA handling to match Pandas.

  • Fixed bug in parallel join's termination condition.

  • Fixed output schema of groupby to ensure Arrow types are used in key columns.

🛠️ Infrastructure

  • Added an option in Pixi to build Bodo with CUDA-aware MPICH, enabling GPU execution in multi-node clusters.

  • Made improvements to the BodoSQL C++ backend, enabling more types of queries to run without falling back to JIT compilation.

⚙️ Dependency Changes

  • Upgraded Numba dependency to 0.65.