Bodo 2021.2 Release (Date: 2/16/2021)

This release includes many new features, bug fixes and usability improvements. Overall, 70 code patches were merged since the last release.

New Features and Improvements

  • Bodo now uses pandas 1.2 and Arrow 3.0 (the latest versions as of this release)

  • Many improvements to error checking and reporting

  • Several documentation improvements

  • Support returning a tuple from a Bodo function when its elements mix distributed and replicated data distributions

  • Improvements in automatic loop unrolling to support column names generated in loops, e.g. pd.DataFrame(X, columns=["y"] + ["x{}".format(i) for i in range(m)])

  • Improvements in caching to cover missing cases

  • Pandas coverage:

    • Support column indices in read_csv() dtype argument. For example: df = pd.read_csv(fname, dtype={3: str})
    • Support for df.to_string()
    • Initial support for pd.Categorical()
    • Support Series.min and Series.max for categorical data
    • Support pd.to_datetime() with categorical string input
    • Support pd.Series() constructor without data argument specified
    • Support dtype="str" in Series constructor
    • Support for Series.to_dict()
    • Support for Series.between()
    • Support Series.loc[] setitem with boolean array index, such as S.loc[idx] = val where idx is a boolean array or Series
    • Support dictionary input, such as {1.0: "A", 4.0: "DD"}
    • Support for pd.TimedeltaIndex min and max
    • Support for pd.tseries.offsets.Week
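
A few of the pandas items above can be sketched together as follows. This is a minimal illustration, not code from the Bodo test suite: the function name demo, the column names and the CSV contents are all made up for the example. It is written as plain pandas so that it runs without Bodo installed; under Bodo the function would presumably be decorated with @bodo.jit.

```python
import io

import pandas as pd


# Assumption: under Bodo this function would carry the @bodo.jit decorator.
# It is left undecorated so the sketch runs with plain pandas.
def demo(csv_text):
    # dtype keyed by column position: the fourth column (index 3) is read
    # as strings, so leading zeros are preserved.
    df = pd.read_csv(io.StringIO(csv_text), dtype={3: str})

    S = df["score"].copy()
    # Series.between() is inclusive on both ends by default.
    in_range = S.between(10, 20)
    # Series.loc[] setitem with a boolean Series as the index.
    S.loc[in_range] = 0

    # Series.to_dict() maps index labels to values.
    return df, in_range, S.to_dict()


# Hypothetical input data, used only for illustration.
csv_text = "id,score,label,code\n1,5,a,007\n2,15,b,042\n3,25,c,100\n"
df, in_range, as_dict = demo(csv_text)
```

Here only the middle row has a score between 10 and 20, so it is the only value overwritten by the S.loc[in_range] assignment.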
  • Numpy coverage:

    • Support axis=1 in distributed np.concatenate
    • Initial support for np.random.multivariate_normal
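
The two NumPy items can be combined in one short sketch. The function name build_features and the mean and covariance values are hypothetical. The code is plain NumPy so it runs anywhere. Since the support for np.random.multivariate_normal is described as initial, which argument combinations Bodo accepts is not stated here and should be checked against the Bodo documentation.

```python
import numpy as np


# Assumption: under Bodo this function would carry the @bodo.jit decorator.
def build_features(n):
    # Hypothetical mean vector and covariance matrix for a 2-D Gaussian.
    mean = np.array([0.0, 1.0])
    cov = np.array([[1.0, 0.5], [0.5, 2.0]])

    # Draw n samples; the result has shape (n, 2).
    A = np.random.multivariate_normal(mean, cov, n)
    # A block of constant columns with shape (n, 3).
    B = np.ones((n, 3))

    # axis=1 joins the arrays column-wise, giving shape (n, 5).
    return np.concatenate((A, B), axis=1)


features = build_features(100)
```

With axis=1 the row count stays the same and the column counts add up, which is what makes this form useful for assembling feature matrices.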
  • Scikit-learn:

    • Add coef_ attribute to SGDClassifier model
    • Add coef_ attribute to LinearRegression model
    • Support for sklearn.preprocessing.LabelEncoder inside jit functions
    • Support for sklearn.metrics.r2_score inside jit functions
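
The scikit-learn additions can be illustrated with a small sketch that fits a LinearRegression model, reads its coef_ attribute, scores it with r2_score, and encodes string labels with LabelEncoder. The helper names fit_and_score and encode and the toy data are assumptions made for the example. The code uses plain scikit-learn; under Bodo the two functions would presumably be decorated with @bodo.jit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.preprocessing import LabelEncoder


# Assumption: under Bodo this function would carry the @bodo.jit decorator.
def fit_and_score(X, y):
    model = LinearRegression()
    model.fit(X, y)
    # coef_ holds one fitted weight per feature.
    return model.coef_, r2_score(y, model.predict(X))


# Assumption: under Bodo this function would carry the @bodo.jit decorator.
def encode(labels):
    # LabelEncoder assigns integer codes to the sorted unique labels.
    encoder = LabelEncoder()
    return encoder.fit_transform(labels)


# Toy data lying exactly on the line y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X[:, 0] + 1.0
coef, r2 = fit_and_score(X, y)
codes = encode(np.array(["b", "a", "c", "a"]))
```

Because the toy data is exactly linear, the fitted slope is 2 (up to floating-point error) and the R² score is 1.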