# Aggregation & Window Functions¶

An aggregation function can be used to combine data across many rows to form a single answer. Aggregations can be done with a `GROUP BY` clause, in which case the combined value is calculated once per unique combination of groupbing keys. Aggregations can also be done without the `GROUP BY` clause, in which case a single value is outputted by calculating the aggregation across all rows.

For example:

``````SELECT AVG(A) FROM table1 GROUP BY B

SELECT COUNT(Distinct A) FROM table1
``````

Window functions can be used to compute an aggregation across a row and its surrounding rows. Most window functions have the following syntax:

``````SELECT WINDOW_FN(ARG1, ..., ARGN) OVER (PARTITION BY PARTITION_COLUMN_1, ..., PARTITION_COLUMN_N ORDER BY SORT_COLUMN_1, ..., SORT_COLUMN_N ROWS BETWEEN <LOWER_BOUND> AND <UPPER_BOUND>) FROM table_name
``````
The `ROWS BETWEEN ROWS BETWEEN <LOWER_BOUND> AND <UPPER_BOUND>` section is used to specify the window over which to compute the function. A bound can can come before the current row, using `PRECEDING` or after the current row, using `FOLLOWING`. The bounds can be relative (i.e. `N PRECEDING` or `N FOLLOWING`), where `N` is a positive integer, or they can be absolute (i.e. `UNBOUNDED PRECEDING` or `UNBOUNDED FOLLOWING`).

For example:

``````SELECT SUM(A) OVER (PARTITION BY B ORDER BY C ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) FROM table1
``````
This query computes the sum of every 3 rows, i.e. the sum of a row of interest, its preceding row, and its following row.

In contrast:

``````SELECT SUM(A) OVER (PARTITION BY B ORDER BY C ROWS BETWEEN UNBOUNDED PRECEDING AND 0 FOLLOWING) FROM table1
``````
This query computes the cumulative sum over a row and all of its preceding rows.

Note

For most window functions, BodoSQL returns `NULL` if the specified window frame is empty or all `NULL`. Exceptions to this behavior are noted.

Window functions perform a series of steps as followed:

1. Partition the data by `PARTITION_COLUMN`. This is effectively a groupby operation on `PARTITION_COLUMN`.
2. Sort each group as specified by the `ORDER BY` clause.
3. Perform the calculation over the specified window, i.e. the newly ordered subset of data.
4. Shuffle the data back to the original ordering.

For BodoSQL, `PARTITION BY` is required, but `ORDER BY` is optional for most functions and `ROWS BETWEEN` is optional for all of them. If `ROWS BETWEEN` is not specified then it defaults to either computing the result over the entire window (if no `ORDER BY` clause is specified) or to using the window `UNBOUNDED PRECEDING TO CURRENT ROW` (if there is an `ORDER BY` clause).

Note

`RANGE BETWEEN` is not currently supported.

Note

If a window frame contains `NaN` values, the output may diverge from Snowflake's behavior. When a `NaN` value enters a window, any window function that combines the results with arithmetic (e.g. `SUM`, `AVG`, `VARIANCE`, etc.) will output `NaN` until the `NaN` value has exited the window.

BodoSQL Currently supports the following Aggregation & Window functions:

Function Supported with GROUP BY? Supported without GROUP BY? Supported as window function? (WINDOW) Allows ORDER BY? (WINDOW) Requires ORDER BY? (WINDOW) Allows frame?
`ANY_VALUE` Y Y Y Y N Y
`APPROX_PERCENTILE` N Y Y N N N
`ARRAY_AGG` Y N N N/A N/A N/A
`ARRAY_UNIQUE_AGG` Y N N N/A N/A N/A
`AVG` Y Y Y Y N Y
`BITAND_AGG` Y Y Y N N N
`BITOR_AGG` Y Y Y N N N
`BITXOR_AGG` Y Y Y N N N
`BOOLAND_AGG` Y Y Y N N N
`BOOLOR_AGG` Y Y Y N N N
`BOOLXOR_AGG` Y Y Y N N N
`CONDITIONAL_CHANGE_EVENT` N N Y Y Y N
`CONDITIONAL_TRUE_EVENT` N N Y Y Y N
`CORR` N N Y N N N
`COUNT` Y Y Y Y N Y
`COUNT(*)` Y Y Y Y N Y
`COUNT_IF` Y Y Y Y N Y
`COVAR_POP` N N Y N N N
`COVAR_SAMP` N N Y N N N
`CUME_DIST` N N Y Y Y N
`DENSE_RANK` N N Y Y Y N
`FIRST_VALUE` N N Y Y N Y
`KURTOSIS` Y Y Y N N N
`LEAD` N N Y Y Y N
`LAST_VALUE` N N Y Y N Y
`LAG` N N Y Y Y N
`LISTAGG` Y Y N N/A N/A N/A
`MAX` Y Y Y Y N Y
`MEDIAN` Y Y Y N N N
`MIN` Y Y Y Y N Y
`MODE` Y N Y Y N N
`NTH_VALUE` N N Y Y N Y
`NTILE` N N Y Y Y N
`OBJECT_AGG` Y N Y N N N
`PERCENTILE_CONT` Y Y N N/A N/A N/A
`PERCENTILE_DISC` Y Y N N/A N/A N/A
`PERCENT_RANK` N N Y Y Y N
`RANK` N N Y Y Y N
`RATIO_TO_REPORT` N N Y Y N N
`ROW_NUMBER` N N Y Y Y N
`SKEW` Y Y Y Y N N
`STDDEV` Y Y Y Y N Y
`STDDEV_POP` Y Y Y Y N Y
`STDDEV_SAMP` Y Y Y Y N Y
`SUM` Y Y Y Y N Y
`VARIANCE` Y Y Y Y N Y
`VARIANCE_POP` Y Y Y Y N Y
`VARIANCE_SAMP` Y Y Y Y N Y
`VAR_POP` Y Y Y Y N Y
`VAR_SAMP` Y Y Y Y N Y