
Bodo 2025.2 Release (Date: 02/13/2025)

=====================================

🎉 Highlights

We've started to revamp our CSV and JSON readers and writers to bring them more in line with our Parquet I/O. As part of this release, we standardized several smaller features and filesystem support across these formats. We’ve just begun, so look forward to more changes!

We also started working on improving compilation time, beginning with a global cache for functions internal to Bodo. Because the cache persists across program runs, it can substantially reduce compilation time on subsequent uses of Bodo.

✨ New Features

  • Support reading CSV/JSON/Parquet data from HuggingFace
  • Support Google Cloud Storage (GCS) for CSV/JSON/Parquet I/O
  • Support glob patterns in CSV read
  • Support zstd compression in CSV/JSON I/O
  • Improved reliability of spawn destructors for lazy data structures
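
To make the glob item above concrete, here is a minimal sketch of what reading a glob pattern means: every file matching the pattern is read and the rows are combined. It uses only the Python standard library rather than Bodo itself, and the helper names (`read_csv_glob`, `write_demo_files`), file names, and data are hypothetical.

```python
import csv
import glob
import os
import tempfile


def read_csv_glob(pattern):
    """Read every CSV file matching `pattern` and return all rows as dicts."""
    rows = []
    # Sort so the result does not depend on filesystem listing order.
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as f:
            rows.extend(csv.DictReader(f))
    return rows


def write_demo_files(directory):
    """Create two small CSV parts plus one unrelated file (hypothetical data)."""
    parts = {
        "part-0.csv": [("1", "a"), ("2", "b")],
        "part-1.csv": [("3", "c")],
        "notes.txt": [("ignored", "ignored")],
    }
    for name, records in parts.items():
        with open(os.path.join(directory, name), "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["id", "label"])
            writer.writerows(records)


with tempfile.TemporaryDirectory() as tmp:
    write_demo_files(tmp)
    # Only part-0.csv and part-1.csv match; notes.txt is skipped.
    demo_rows = read_csv_glob(os.path.join(tmp, "part-*.csv"))

# Assumption, not run here: in Bodo the pattern would be passed directly
# as the path, e.g. pd.read_csv("data/part-*.csv") inside a @bodo.jit function.
```

The sketch sorts the matched paths so that the combined row order is deterministic; whether Bodo guarantees a particular file order is not stated in these notes.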

🏎️ Performance Improvements

  • Improved compilation time of dataframe unboxing
  • Improved compilation time by caching functions internal to Bodo across program runs

🐞 Bug Fixes

  • Improved the error message shown when Bodo detects an out-of-memory (OOM) condition so that it suggests potential solutions and paths forward
  • Fixed a bug in parallel read of JSON lines files
  • Fixed the Pandas warning that appeared when using to_csv or to_json

⚙️ Dependency Upgrades

  • Upgraded Calcite to 1.38
  • Upgraded Numba to 0.61