Bodo 2020.05 Release (Date: 05/06/2020)
New Features and Improvements
- Bodo is updated to use the latest versions of Numba and Apache Arrow packages:
  - numba 0.49.0
  - Apache Arrow 0.17.0
- Various improvements to clarity and conciseness of error messages
- Initial support for `pandas.DataFrame.to_sql()`
- `pandas.read_sql()` supports `sql` and `con` passed as arguments to Bodo-decorated functions
- Added support for `pandas.read_json()` and `pandas.DataFrame.to_json()` from and to POSIX, S3, and Hadoop File Systems
- Initial support for `pandas.read_excel()`
- `numpy.fromfile()` and `numpy.ndarray.tofile()` from and to S3 and Hadoop File Systems
- Reduction in number of requests in I/O read calls
- Initial support for arrays of lists of fixed-size values
- List of strings data type support for `pandas.DataFrame.join()`, `pandas.DataFrame.drop_duplicates()`, and `pandas.DataFrame.groupby()`
- `pandas.Timestamp` subtraction, min and max
- Improved support for null values in datetime and timedelta operations
- Support `copy()` function for Series of `decimal.Decimal` and `datetime.date` data types and most Index types
- Improved support for Series `decimal.Decimal` dtype
- String Series and DataFrame columns are now mutable and support inplace `fillna()`
- `pandas.Series.round()`
- `pandas.DataFrame.assign()`
- Support `groupby(...).first()` operation
- `pandas.DataFrame.iloc` support for extracting a subset of columns
- `numpy.ndarray.sum(axis=0)`
- `numpy.reshape()` for multi-dimensional distributed arrays
- Initial implementation of experimental legacy mode
- Proper error when using unsupported `pandas.(...)` and `pandas.Series.(...)` functions
- Improved robustness of `pandas.DataFrame` inplace operations
- Memory usage improvements
- Type safety improvements
- Compilation time improvements
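
Several of the pandas items listed above (inplace `fillna()` on a string Series, `Series.round()`, `DataFrame.assign()`, `groupby(...).first()`, and `iloc` column selection) can be exercised together in one small function. The following is a minimal sketch, not taken from the Bodo documentation: the function name `summarize` and the column names are hypothetical, and the function is left undecorated so that it runs on plain pandas. With Bodo installed, it would presumably be wrapped with the `bodo.jit` decorator, which has not been verified here against the 2020.05 release.

```python
import pandas as pd


# Assumption: under Bodo this function would carry the @bodo.jit decorator.
# It is left undecorated so the sketch runs with pandas alone.
def summarize(df):
    # Work on a copy of the string column, then fill nulls in place.
    names = df["name"].copy()
    names.fillna("unknown", inplace=True)

    # assign() returns a new DataFrame with the replaced/rounded columns.
    cleaned = df.assign(name=names, price=df["price"].round(1))

    # First row of each group (first non-null value per column).
    firsts = cleaned.groupby("group").first()

    # iloc with a list of positions extracts a subset of columns.
    subset = cleaned.iloc[:, [0, 2]]
    return cleaned, firsts, subset


# Hypothetical sample data for illustration only.
df = pd.DataFrame(
    {
        "group": ["a", "a", "b"],
        "name": ["x", None, "z"],
        "price": [1.26, 2.34, 3.38],
    }
)
cleaned, firsts, subset = summarize(df)
```

Here `cleaned` holds the filled and rounded columns, `firsts` has one row per value of `group`, and `subset` keeps only the `group` and `price` columns.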
 
Bug Fixes
- Fixed an issue in `pandas.read_csv()` reading a large CSV file in specific distributed cases
- Fixed an issue in `numpy.dot()` with empty vector/matrix input
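
The release note does not say how `numpy.dot()` failed on empty inputs. For reference, the sketch below shows what plain NumPy returns in that situation, which is presumably the behavior a compiled function is expected to match. The variable names are illustrative only.

```python
import numpy as np

# A 3x0 matrix times a length-0 vector: the sum over an empty axis is zero,
# so the result is a length-3 vector of zeros.
empty_matrix = np.zeros((3, 0))
empty_vector = np.zeros(0)
result = np.dot(empty_matrix, empty_vector)

# The inner product of two empty vectors is the scalar 0.0.
inner = np.dot(empty_vector, empty_vector)
```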