Bodo 2021.1 Release (Date: 1/26/2021)
This release includes many new features, bug fixes and performance improvements. Overall, 61 code patches were merged since the last release.
New Features and Improvements
- Connectors:
  - Support filter pushdown when reading partitioned parquet datasets: at compile time, Bodo detects if filters are applied to a dataframe after read_parquet, and generates code that applies those filters at read time so that only the required parquet files are read.
  - Support for Series.to_csv()
  - Support passing the file and dtype arguments of np.fromfile as kwargs.
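As a rough illustration of the filter pushdown idea described above, the following sketch prunes hive-style partition directories before opening any file. It uses only the Python standard library and CSV files in place of parquet; the year=<value> directory layout and the helper names write_partitioned and read_with_pushdown are hypothetical and are not part of Bodo. In Bodo the decision is derived at compile time from the filter that follows read_parquet in user code, whereas here the predicate is simply passed in by hand.

```python
import csv
import tempfile
from pathlib import Path


def write_partitioned(root, rows_by_year):
    """Write one CSV file per hive-style partition directory (year=<value>)."""
    for year, rows in rows_by_year.items():
        part_dir = Path(root) / f"year={year}"
        part_dir.mkdir(parents=True, exist_ok=True)
        with open(part_dir / "part-0.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["value"])
            writer.writerows([v] for v in rows)


def read_with_pushdown(root, year_predicate):
    """Read only the partitions whose year passes the predicate.

    Returns the rows that were read and the list of files actually opened.
    """
    rows, opened = [], []
    for part_dir in sorted(Path(root).glob("year=*")):
        # The partition value is encoded in the directory name.
        year = int(part_dir.name.split("=", 1)[1])
        if not year_predicate(year):
            continue  # pruned: files in this partition are never opened
        for path in sorted(part_dir.glob("*.csv")):
            opened.append(path)
            with open(path, newline="") as f:
                for record in csv.DictReader(f):
                    rows.append((year, int(record["value"])))
    return rows, opened


with tempfile.TemporaryDirectory() as root:
    write_partitioned(root, {2019: [1, 2], 2020: [3, 4, 5], 2021: [6]})
    rows, opened = read_with_pushdown(root, lambda year: year >= 2020)
    opened_names = [p.parent.name for p in opened]
```

Only the 2020 and 2021 partitions are opened here; the 2019 file is skipped without being read, which is the saving that pushdown aims for.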
- Support for f-strings in Bodo jitted functions
- Support passing Bodo distributed JIT functions to other Bodo JIT functions
- Pandas coverage:
  - Support groupby with pd.NamedAgg()
  - Support for groupby.size
  - Support for groupby.shift
  - Match input row order of pandas in groupby.apply when applicable
  - Support min_periods in rolling calls
  - Support passing a dictionary of data types to df.astype()
  - Support dataframe setitem of multiple columns. For example: df[["A", "B"]] = 1.3
  - Support for Index.get_loc()
  - Support ddof argument (delta degrees of freedom) of Series.cov
  - Support Series.is_monotonic property
  - Initial support for dictionaries in Series.replace
  - Support Series.reset_index(drop=True)
  - Support level argument with all levels in reset_index()
  - Several documentation improvements
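For orientation, here are several of the pandas calls listed above, written as plain pandas so the sketch runs without Bodo. The sample data and variable names are illustrative assumptions. Inside a Bodo program the same calls would sit in a function decorated with bodo.jit; these notes do not say which optional arguments beyond the ones named are accepted there.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})

# groupby with pd.NamedAgg: each output column is named explicitly.
agg = df.groupby("key").agg(
    total=pd.NamedAgg(column="val", aggfunc="sum"),
    largest=pd.NamedAgg(column="val", aggfunc="max"),
)

# groupby.size and groupby.shift
sizes = df.groupby("key").size()
shifted = df.groupby("key")["val"].shift(1)

# min_periods in rolling calls: emit a result even for a partial window.
rolled = pd.Series([1.0, 2.0, 3.0]).rolling(2, min_periods=1).sum()

# A dictionary of data types passed to df.astype()
typed = df.astype({"val": "float64"})

# Dataframe setitem of multiple columns with one scalar
cols = pd.DataFrame({"A": [1.0, 2.0], "B": [3.0, 4.0]})
cols[["A", "B"]] = 1.3

# Index.get_loc()
loc = pd.Index(["x", "y", "z"]).get_loc("y")

# ddof argument of Series.cov (ddof=0 gives the population covariance)
s1 = pd.Series([1.0, 2.0, 3.0])
s2 = pd.Series([2.0, 4.0, 6.0])
sample_cov = s1.cov(s2)
population_cov = s1.cov(s2, ddof=0)

# A dictionary passed to Series.replace
replaced = pd.Series([1, 2, 3]).replace({1: 10, 3: 30})

# Series.reset_index(drop=True) discards the old index after filtering.
filtered = df["val"][df["val"] > 2].reset_index(drop=True)
```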
- Scikit-learn:
  - Support for sklearn.model_selection.train_test_split inside jit functions.
  - Support for sklearn.preprocessing.MinMaxScaler inside jit functions.
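The two scikit-learn entries above could be combined roughly as follows. This is a minimal sketch in plain scikit-learn so that it runs without Bodo; the function name split_and_scale, the sample arrays, and the test_size and random_state values are assumptions made for illustration. Under Bodo the function would be decorated with bodo.jit, and whether these particular keyword arguments are accepted inside a jitted function is not stated in these notes.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler


def split_and_scale(X, y):
    """Split into train and test sets, then scale features to [0, 1].

    The scaler is fitted on the training rows only, so the test rows
    do not influence the learned minimum and maximum.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0
    )
    scaler = MinMaxScaler()
    scaler.fit(X_train)
    return scaler.transform(X_train), scaler.transform(X_test), y_train, y_test


# Hypothetical sample data: 8 rows, 2 feature columns.
X = np.arange(16, dtype=np.float64).reshape(8, 2)
y = np.arange(8)
X_train, X_test, y_train, y_test = split_and_scale(X, y)
```

Because the scaler sees only the training rows, scaled test values can fall outside [0, 1] when a test row lies beyond the training range.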