DataFrameGroupBy.agg

DataFrameGroupBy.agg(func=None, engine=None, engine_kwargs=None, **kwargs) -> BodoDataFrame

Apply one or more aggregate functions to groups of data in a BodoDataFrame. This method is the same as DataFrameGroupBy.aggregate.

Parameters

func : function, str, list, dict or None: Function(s) to use for aggregating the data. Acceptable combinations are:

  • A supported function e.g. sum
  • The name of a supported aggregation function e.g. "sum"
  • A list of functions, which will be applied to each selected column e.g. ["sum", "count"]
  • A dictionary mapping column name to aggregate function e.g. {"col_1": "sum", "col_2": "mean"}
  • None, along with keyword arguments specifying named aggregations.

Refer to our documentation for the aggregate functions that are currently supported. Any other combination of arguments, as well as user-defined functions, will either fall back to Pandas DataFrameGroupBy.agg or raise a descriptive error.

**kwargs : Keyword arguments used to create named aggregations. Each should be in the form new_name=pd.NamedAgg(column_name, function) or simply new_name=(column_name, function).

Note

The engine and engine_kwargs parameters are not supported, and will trigger a fallback to Pandas if specified.

Returns

BodoDataFrame

Examples

import bodo.pandas as bd

bdf1 = bd.DataFrame({
    "A": ["foo", "foo", "bar", "bar"],
    "C": [1, 2, 3, 4],
    "D": ["A", "A", "C", "D"]
})

bdf2 = bdf1.groupby("A").agg("sum")

print(bdf2)
Output:
     C   D
A
bar  7  CD
foo  3  AA


bdf3 = bdf1.groupby("A").agg(["sum", "count"])

print(bdf3)
Output:
      C         D
    sum count sum count
A
bar   7     2  CD     2
foo   3     2  AA     2


bdf4 = bdf1.groupby("A").agg({"C": "mean", "D": "nunique"})

print(bdf4)
Output:
       C  D
A
bar  3.5  2
foo  1.5  1


bdf5 = bdf1.groupby("A").agg(mean_C=bd.NamedAgg("C", "mean"), sum_D=bd.NamedAgg("D", "sum"))

print(bdf5)
Output:
     mean_C sum_D
A
bar     3.5    CD
foo     1.5    AA
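
The examples above do not show the tuple shorthand new_name=(column_name, function) described under **kwargs. The sketch below illustrates it with plain Pandas rather than Bodo, on the assumption that a BodoDataFrame accepts the same call; that assumption was not verified here. The max_C column name is made up for illustration.

```python
import pandas as pd

# Plain Pandas stand-in for the BodoDataFrame used in the examples above
# (assumption: bodo.pandas accepts the same keyword form).
df = pd.DataFrame({
    "A": ["foo", "foo", "bar", "bar"],
    "C": [1, 2, 3, 4],
})

# Each keyword names an output column; each tuple is (input column, aggregation).
result = df.groupby("A").agg(mean_C=("C", "mean"), max_C=("C", "max"))

# The same aggregation spelled out with NamedAgg objects.
explicit = df.groupby("A").agg(
    mean_C=pd.NamedAgg("C", "mean"),
    max_C=pd.NamedAgg("C", "max"),
)

print(result)
print(result.equals(explicit))
```

Both spellings name the output columns directly, so the result has flat column labels instead of the two-level header produced by passing a list of functions.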