Bodo 2021.2 Release (Date: 2/16/2021)¶
This release includes many new features, bug fixes and usability improvements. Overall, 70 code patches were merged since the last release.
New Features and Improvements¶
-
Bodo is updated to use pandas 1.2 and Arrow 3.0 (latest)
-
Many improvements to error checking and reporting
-
Several documentation improvements
-
Support tuple return from Bodo functions where elements of the tuple have a mix of distributed and replicated distributions
-
Improvements in automatic loop unrolling to support column names generated in loops, e.g.
pd.DataFrame(X, columns=["y"] + ["x{}".format(i) for i in range(m)])
-
Improvements in caching to cover missing cases
-
Pandas coverage:
- Support column indices in
read_csv()
dtype
argument. For example:df = pd.read_csv(fname, dtype={3: str})
- Support for
df.to_string()
- Initial support for
pd.Categorical()
- Support
Series.min
andSeries.max
for categorical data - Support
pd.to_datetime()
with categorical string input - Support
pd.Series()
constructor withoutdata
argument specified - Support
dtype="str"
in Series constructor - Support for
Series.to_dict()
- Support for
Series.between()
- Support
Series.loc[]
setitem with boolean array index, such asS.loc[idx] = val
whereidx
is a boolean array or Series - Support dictionary input in
Series.map()
, such asS.map({1.0: "A", 4.0: "DD"})
- Support for
pd.TimedeltaIndex
min and max - Support for
pd.tseries.offsets.Week
- Support column indices in
-
Numpy coverage:
- Support
axis=1
in distributednp.concatenate
- Initial support for
np.random.multivariate_normal
- Support
-
Scikit-learn:
- Add
coef_
attribute to SGDClassifier model. - Add
coef_
attribute to LinearRegression model. - Support for
sklearn.preprocessing.LabelEncoder
inside jit functions. - Support for
sklearn.metrics.r2_score
inside jit functions.
- Add