Bodo 2022.7 Release (Date: 07/31/2022)
New Features and Improvements
Compilation / Performance improvements:
- Groupby operations are now faster to compile and support super-wide DataFrames.
- Groupby.apply() operations have improved compilation time, runtime memory usage and performance.
- Most BodoSQL select statements are now faster to compile.
- Cache is now automatically invalidated when upgrading Bodo.
 
Iceberg:
- Added support for writing Iceberg tables via to_sql.
I/O:
- to_csv, to_json, and to_parquet now support a custom argument _bodo_file_prefix to specify the prefix of files written in distributed cases.
- Snowflake data load now supports filter pushdown with Series.str.startswith and Series.str.endswith.
Pandas coverage:
- read_csv and read_json now support the argument sample_nrows to set the number of rows that are sampled to infer column dtypes (by default sample_nrows=100).
- Support for DataFrame.rank
- Support for Groupby.ngroup
- Added support for dictionary-encoded string arrays (that have reduced memory usage and execution time) in the following functions:
  - Groupby.min
  - Groupby.max
  - Groupby.first
  - Groupby.last
  - Groupby.shift
  - Groupby.head
  - Groupby.nunique
  - Groupby.sum
  - Groupby.cumsum
  - Groupby.transform
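
To illustrate why dictionary-encoded string arrays can reduce memory usage and execution time, here is a minimal conceptual sketch in plain Python. It is not Bodo's internal implementation; the helper names dictionary_encode and groupby_min are hypothetical and exist only for this illustration. Each distinct string is stored once, rows hold small integer codes, and a per-group minimum then needs only integer comparisons per row.

```python
def dictionary_encode(values):
    """Encode a list of strings (None = missing) as (dictionary, indices).

    Hypothetical helper for illustration only: each distinct string is
    stored once in `dictionary`, and every row stores an integer code.
    Missing values are encoded as -1.
    """
    dictionary = []
    positions = {}
    indices = []
    for value in values:
        if value is None:
            indices.append(-1)
            continue
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


def groupby_min(keys, dictionary, indices):
    """Per-group minimum string computed on the integer codes.

    The distinct strings are ranked once; the per-row work is then an
    integer comparison instead of a string comparison.
    """
    order = sorted(range(len(dictionary)), key=lambda i: dictionary[i])
    rank_of_code = {code: rank for rank, code in enumerate(order)}
    best = {}
    for key, code in zip(keys, indices):
        if code < 0:
            continue  # skip missing values
        if key not in best or rank_of_code[code] < rank_of_code[best[key]]:
            best[key] = code
    return {key: dictionary[code] for key, code in best.items()}


values = ["pear", "apple", "pear", "fig", "apple", None]
keys = ["x", "x", "y", "y", "x", "y"]
dictionary, indices = dictionary_encode(values)
minimums = groupby_min(keys, dictionary, indices)
```

With this input, the six rows share only three stored strings, and the group minimums are computed without comparing any string more than once during ranking.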
 
BodoSQL:
- Added support for the following query syntax:
  - QUALIFY
  - GROUP BY GROUPING SETS
  - GROUP BY CUBE
  - GROUP BY ROLLUP
- Added support for the following functions:
  - IFF
  - NULLIFZERO
  - NVL2
  - ZEROIFNULL
- Added support for the following windowed aggregation functions:
  - RANK
  - DENSE_RANK
  - PERCENT_RANK
  - CUME_DIST
- The following functions are much faster to compile:
  - ADDDATE/DATE_ADD/SUBDATE/DATE_SUB if the second argument is an integer column
  - ASCII
  - CHAR
  - COALESCE
  - CONV
  - DAYNAME
  - FORMAT
  - FROM_DAYS
  - FROM_UNIXTIME
  - IF
  - IFNULL
  - INSTR
  - LAST_DAY
  - LEFT
  - LOG
  - LPAD
  - MAKEDATE
  - MONTHNAME
  - NULLIF
  - NVL
  - ORD
  - REPEAT
  - REPLACE
  - REVERSE
  - RIGHT
  - RPAD
  - SPACE
  - STRCMP
  - SUBSTRING
  - SUBSTRING_INDEX
  - TIMESTAMPDIFF (if the unit is Month, Quarter, or Year)
  - Unary -
  - WEEKDAY
  - YEAROFWEEKISO
- Support for binary data in complex join operations
- Support for UTF-8 string literals in queries (previously just ASCII).
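
The four windowed aggregation functions listed above differ mainly in how they treat ties. The sketch below shows those differences using Python's built-in sqlite3 module as a stand-in engine (this assumes SQLite 3.25 or later, which added window functions). It is not BodoSQL itself, and the table name scores, its columns, and the helper rank_scores are hypothetical; the point is only the standard SQL semantics of the four functions.

```python
import sqlite3


def rank_scores(rows):
    """Run the four ranking window functions over (name, score) rows.

    Uses an in-memory SQLite database as a stand-in SQL engine; the table
    and column names are made up for this illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
    conn.executemany("INSERT INTO scores VALUES (?, ?)", rows)
    query = """
        SELECT name,
               score,
               RANK()         OVER (ORDER BY score) AS rnk,
               DENSE_RANK()   OVER (ORDER BY score) AS dense_rnk,
               PERCENT_RANK() OVER (ORDER BY score) AS pct_rnk,
               CUME_DIST()    OVER (ORDER BY score) AS cume
        FROM scores
        ORDER BY score, name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


rows = [("a", 10), ("b", 20), ("c", 20), ("d", 30), ("e", 40)]
ranked = rank_scores(rows)
for row in ranked:
    print(row)
```

For the two tied rows (score 20), RANK leaves a gap after the tie (the next row gets 4) while DENSE_RANK does not (the next row gets 3). PERCENT_RANK is (rank - 1) / (row count - 1), and CUME_DIST is the fraction of rows with a value less than or equal to the current one.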