2024)

Added support for pd.Series.argmin, pd.Series.argmax, pd.Series.str.removeprefix, pd.Series.str.removesuffix, pd.Series.str.casefold and pd.Series.str.fullmatch.
Added support for pd.Series.str.partition with expand=True.
Added support for HAVERSINE with Decimal input data type.
Changed Bodo logger defaults to stdout instead of stderr.
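
As a rough illustration of the Series methods listed above, the following sketch uses plain pandas so it runs without Bodo installed. The sample data and variable names are made up for this example; under Bodo the same calls would normally sit inside a function decorated with `bodo.jit`, which is omitted here.

```python
import pandas as pd

# Hypothetical sample data, chosen only to exercise each method.
s = pd.Series(["pre_Alpha", "pre_beta", "GAMMA_suf"])
nums = pd.Series([3, 1, 2])

# Positional index of the smallest and largest values.
pos_min = nums.argmin()
pos_max = nums.argmax()

# Strip a fixed prefix or suffix where present; other values are unchanged.
no_prefix = s.str.removeprefix("pre_")
no_suffix = s.str.removesuffix("_suf")

# Aggressive lowercasing intended for caseless comparison.
folded = s.str.casefold()

# True only when the whole string matches the pattern.
full = s.str.fullmatch(r"pre_[A-Za-z]+")

# With expand=True, partition returns a DataFrame with three columns:
# text before the separator, the separator, and text after it.
parts = s.str.partition("_", expand=True)
```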

Changed Iceberg write to use Arrow azurefs instead of Hadoop.
Changed file scan planning to use Iceberg metadata instead of Parquet metadata, speeding up Iceberg reads overall.
Added the ability to fetch metadata for Snowflake-managed Iceberg tables at the beginning of query execution and in parallel, for faster Iceberg file scan planning.
Added streaming support for the window functions COUNT(X), COUNT_IF, BOOLAND_AGG, BOOLOR_AGG, BITAND_AGG, BITOR_AGG and BITXOR_AGG.
Added streaming support for the window functions LEAD, LAG and NTILE when a PARTITION BY clause is provided.
Added streaming support for the window functions FIRST_VALUE, LAST_VALUE, ANY_VALUE, MIN, and MAX on numeric data.
Ensured BodoSQL decomposes the window functions PERCENT_RANK, CUME_DIST and RATIO_TO_REPORT into other window functions that can be computed together with streaming.
Enabled computation of multiple window functions at once while streaming.
Enabled window functions computed with an OVER () window in streaming to spill data to disk, reducing peak memory utilization.
Improved the quality of the BodoSQL planner to reduce redundant computation.
Added various optimizations for the streaming sort operator.
Made the BodoSQL planner more aggressive with eliminating common subexpressions that are not top-level expressions.
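
The note above about decomposing PERCENT_RANK, CUME_DIST and RATIO_TO_REPORT can be illustrated with their standard SQL definitions, each of which is expressible through simpler window functions (RANK, COUNT, SUM). The sketch below reproduces those definitions in pandas. It is not BodoSQL's actual rewrite, which these notes do not describe, and the column names and data are hypothetical.

```python
import pandas as pd

# Hypothetical data: "grp" plays the role of PARTITION BY, "x" of ORDER BY.
df = pd.DataFrame({"grp": ["a", "a", "a", "b", "b"], "x": [10, 20, 20, 5, 15]})
g = df.groupby("grp")["x"]

count = g.transform("count")  # COUNT(*) OVER (PARTITION BY grp)
total = g.transform("sum")    # SUM(x)   OVER (PARTITION BY grp)
rank = g.rank(method="min")   # RANK()   OVER (PARTITION BY grp ORDER BY x)

# RATIO_TO_REPORT(x) = x / SUM(x)
df["ratio_to_report"] = df["x"] / total

# PERCENT_RANK = (RANK - 1) / (COUNT - 1)
# Single-row partitions, where SQL returns 0, would need special handling;
# none occur in this sample.
df["percent_rank"] = (rank - 1) / (count - 1)

# CUME_DIST = (number of rows with x <= current x) / COUNT
df["cume_dist"] = g.rank(method="max") / count
```

Because all three results are derived from the same few partition-level aggregates, computing them together avoids repeated passes over each partition.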

Increased the amount of query decorrelation possible in BodoSQL.
Fixed a bug in Snowflake-managed Iceberg table writer where the catalog integration creation could fail in the presence of another concurrent writer.
Fixed various bugs in the streaming sort operator.
Fixed the behavior of pd.Series.str.split when n>=1 and no delimiter is provided.
Improved stability when reading from CSV files.
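
For reference, the pd.Series.str.split case mentioned above (n>=1 with no delimiter) looks like this in plain pandas, where omitting the delimiter means splitting on whitespace. The data is hypothetical and the sketch shows only the expected pandas semantics, not the Bodo bug itself.

```python
import pandas as pd

# Hypothetical input strings.
s = pd.Series(["a b c", "x y", "single"])

# No delimiter given: split on whitespace, performing at most n=1 split.
result = s.str.split(n=1)
```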