Bodo 2021.2 Release (Date: 2/16/2021)¶
This release includes many new features, bug fixes and usability improvements. Overall, 70 code patches were merged since the last release.
New Features and Improvements¶
-
Bodo is updated to use pandas 1.2 and Arrow 3.0 (latest)
-
Many improvements to error checking and reporting
-
Several documentation improvements
-
Support tuple return from Bodo functions where elements of the tuple have a mix of distributed and replicated distributions
-
Improvements in automatic loop unrolling to support column names generated in loops, e.g.
pd.DataFrame(X, columns=["y"] + ["x{}".format(i) for i in range(m)]) -
Improvements in caching to cover missing cases
-
Pandas coverage:
- Support column indices in
read_csv()dtypeargument. For example:df = pd.read_csv(fname, dtype={3: str}) - Support for
df.to_string() - Initial support for
pd.Categorical() - Support
Series.minandSeries.maxfor categorical data - Support
pd.to_datetime()with categorical string input - Support
pd.Series()constructor withoutdataargument specified - Support
dtype="str"in Series constructor - Support for
Series.to_dict() - Support for
Series.between() - Support
Series.loc[]setitem with boolean array index, such asS.loc[idx] = valwhereidxis a boolean array or Series - Support dictionary input in
Series.map(), such asS.map({1.0: "A", 4.0: "DD"}) - Support for
pd.TimedeltaIndexmin and max - Support for
pd.tseries.offsets.Week
- Support column indices in
-
Numpy coverage:
- Support
axis=1in distributednp.concatenate - Initial support for
np.random.multivariate_normal
- Support
-
Scikit-learn:
- Add
coef_attribute to SGDClassifier model. - Add
coef_attribute to LinearRegression model. - Support for
sklearn.preprocessing.LabelEncoderinside jit functions. - Support for
sklearn.metrics.r2_scoreinside jit functions.
- Add