# (TIL) Pandas: Complex aggregation expressions

```python
>>> import numpy as np
>>> import pandas as pd

>>> df = pd.DataFrame(np.random.randn(10, 3), columns=list('ABC'))
>>> df['D'] = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]

# Define the aggregation calculations.
# Note: the nested-dict form ({'A': {'total_A': 'sum'}}) was deprecated in
# pandas 0.20 and raises SpecificationError since pandas 1.0. A list of
# (name, function) tuples gives the same renamed, two-level columns.
>>> aggregations = {
...     # Work on the "A" column.
...     'A': [
...         ('total_A', 'sum'),     # get the sum, and call this result 'total_A'
...         ('average_A', 'mean'),  # get the mean, call the result 'average_A'
...         ('num_A', 'count'),     # count non-null values, call the result 'num_A'
...     ],
...     # Work on the "B" column.
...     'B': [
...         ('max_B', 'max'),       # find the max, call the result 'max_B'
...         ('min_B', 'min'),
...         ('range_B', lambda x: x.max() - x.min()),
...     ],
...     # Calculate two results for the 'C' column with a list of aggregation functions.
...     'C': ['count', 'max'],
... }

# Perform groupby aggregation by column "D".
# The input is random, so the numbers differ from run to run.
>>> df.groupby('D').agg(aggregations)
          A                            B                          C
    total_A average_A num_A     max_B     min_B   range_B count       max
D
1 -1.550369 -0.516790     3  1.478065 -0.986305  2.464370     3 -0.158469
2 -2.295627 -0.765209     3  1.661266  0.390151  1.271115     3  1.150179
3  0.031307  0.007827     4  1.769680 -0.658708  2.428388     4  0.875891
```
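
On recent pandas versions (0.25 and later), the same kind of renamed per-column summary can also be written with named aggregation, where each keyword is the output column and each value is a `(column, function)` pair. A minimal sketch, assuming a small hand-made frame instead of the random data above; the names `count_C` and `max_C` are made up here for illustration:

```python
import pandas as pd

# Small deterministic frame (illustrative values, not the post's random data).
df = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [10.0, 30.0, 20.0, 5.0, 9.0],
    "C": [0.5, 1.5, 2.5, 3.5, 4.5],
    "D": [1, 1, 1, 2, 2],
})

# Named aggregation: output_name=(input_column, aggregation).
summary = df.groupby("D").agg(
    total_A=("A", "sum"),
    average_A=("A", "mean"),
    num_A=("A", "count"),
    max_B=("B", "max"),
    min_B=("B", "min"),
    range_B=("B", lambda x: x.max() - x.min()),
    count_C=("C", "count"),  # hypothetical name for the count of C
    max_C=("C", "max"),      # hypothetical name for the max of C
)
print(summary)
```

This form returns flat column names rather than a two-level header, which tends to be easier to work with in later steps.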

Via shanelynn.ie.
