Groupby
Groupby, often written as group by in SQL and as groupby in programming libraries, is a data processing operation that partitions a dataset into subgroups based on one or more keys and then applies a calculation to each subgroup. When that calculation is an aggregation, the result is a table with one row per group, holding summary values such as totals, means, or counts for each category.
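The partition-then-calculate pattern described above can be sketched in plain Python. This is a minimal illustration using only the standard library, not how any particular database or library implements grouping. The sample rows and the helper names group_by and aggregate are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (region, sales amount)
rows = [
    ("north", 100),
    ("south", 250),
    ("north", 50),
    ("east", 75),
    ("south", 150),
]

def group_by(records, key_func, value_func):
    """Partition records into lists of values, keyed by key_func(record)."""
    groups = defaultdict(list)
    for record in records:
        groups[key_func(record)].append(value_func(record))
    return dict(groups)

def aggregate(groups, funcs):
    """Apply each named aggregation to every group, giving one result row per group."""
    return {
        key: {name: func(values) for name, func in funcs.items()}
        for key, values in groups.items()
    }

# Step 1: split the rows by the grouping key (the region).
groups = group_by(rows, key_func=lambda r: r[0], value_func=lambda r: r[1])

# Step 2: compute summary values for each group.
summary = aggregate(groups, {"total": sum, "mean": mean, "count": len})

for region, stats in sorted(summary.items()):
    print(region, stats)
```

Separating the grouping step from the aggregation step mirrors the definition: the same partition can feed several different calculations without being rebuilt.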
In SQL, GROUP BY groups rows by the specified expressions. Each distinct value (or combination) of the
In data analysis libraries, groupby constructs manage the grouping logic and support subsequent operations such as
Common aggregations include sum, mean, median, count, and standard deviation. Grouping is used for categorical, temporal,
Related concepts include windowed operations that compute values within a group across a sequence, and the