Bodo 2021.12 Release (Date: 12/29/2021)
This release includes many new features and usability improvements. Overall, 67 code patches were merged since the last release.
New Features and Improvements
- Significant upgrades to the Bodo documentation to improve the developer experience
- Improvements to documentation and unsupported attribute handling for Pandas APIs
- Significant enhancements to objmode user experience and robustness, such as automatic output data type checking and automatic conversion if possible
- Improved support for the re package, such as support for re flags, better support for returning None when necessary, and better catching of unsupported corner cases
- Support caching functions that take a string as input and create a file path using concatenation, for example by appending a file name to a folder path passed as an argument
- Connectors:
  - Improved read_parquet runtime performance when reading from S3
  - Decreased compilation time for read_csv on DataFrames with a large number of columns (100)
- Improved compilation time for DataFrames with a large number of columns (>10,000)
- Improved NA handling in User Defined Functions with df.apply when functions are not inlined
- Support for using logging.RootLogger.info when passing the logger as an argument to a JIT function
- Support for datetime.datetime.today
- Simpler bodo.scatterv usage from regular Python. Other ranks are ignored but not required to have None as their data
- Improved support for map arrays in various operations
- Support feature_importances_ of XGBoost
- Support predict_proba and predict_log_proba in Scikit-learn classifier algorithms
- Pandas:
  - Support for Bodo specific argument _bodo_upcast_to_float64 in pd.read_csv. This can be used when all data is numeric but schema inference cannot accurately predict data types.
  - Support for using DataFrame.to_parquet with "wide" DataFrames that have a large number of columns
  - Support for storing a DatetimeIndex with DataFrame.to_parquet
  - Support for the 'method' argument in DataFrame.fillna and Series.fillna
  - Support for Series.bfill, Series.ffill, Series.pad, and Series.backfill
  - Support for Series.keys
  - Support for Series.infer_objects and DataFrame.infer_objects
  - Decreased runtime when calling .astype("category") on Series with large numbers of categories
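
As a rough illustration of several of the Pandas APIs listed above, the sketch below shows what forward filling, backward filling, Series.keys, and infer_objects do in plain pandas. It does not use Bodo itself; it assumes that the Bodo-compiled versions follow the same pandas semantics. The variable names and sample data are made up for illustration. Series.pad and Series.backfill are older aliases for the forward and backward fill operations, so the sketch uses ffill and bfill, which are available across pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a Series with two missing values in the middle.
s = pd.Series([1.0, np.nan, np.nan, 4.0], index=["a", "b", "c", "d"])

forward = s.ffill()   # propagate the last valid value forward
backward = s.bfill()  # propagate the next valid value backward

# Series.keys() returns the index of the Series.
labels = list(s.keys())

# infer_objects() converts object-dtype columns to a more specific dtype when possible.
df = pd.DataFrame({"x": pd.Series([1, 2, 3], dtype="object")})
inferred = df.infer_objects()

print(forward.tolist())      # [1.0, 1.0, 1.0, 4.0]
print(backward.tolist())     # [1.0, 4.0, 4.0, 4.0]
print(labels)                # ['a', 'b', 'c', 'd']
print(inferred.dtypes["x"])  # int64
```

To try the same calls under Bodo, the body would presumably be moved into a function decorated with bodo.jit; that step is left out here because it depends on the installed Bodo version.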
 