Pandas: Some notes on groupby
- The `count()` aggregation function counts only non-null values. To count all values, whether null or non-null, use `size()`.
- You can specify the names of aggregated columns as the keyword arguments to the `agg` function. Here I unpack a dictionary with `**` so that I can use string constants for column names.

  ```python
  # Assumes df is the iris data set with a "class" column.

  # Series level
  df.groupby("class")["sepal length (cm)"].agg(
      **{
          # 'new column': 'function',
          "sepal_average_length": "mean",
          "sepal_standard_deviation": "std",
      }
  )

  # DataFrame level
  df.groupby(["class"]).agg(
      **{
          # 'new column': ('column', 'function'),
          "sepal_average_length": ("sepal length (cm)", "mean"),
          "sepal_standard_deviation": ("sepal length (cm)", "std"),
      }
  )
  ```
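To make the first note concrete, here is a minimal sketch of the difference between `count()` and `size()`. The toy DataFrame and its `group` and `value` column names are made up for illustration and are not part of the iris data used above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: group "a" has one null in the "value" column.
df = pd.DataFrame(
    {
        "group": ["a", "a", "b"],
        "value": [1.0, np.nan, 3.0],
    }
)

grouped = df.groupby("group")["value"]

non_null_counts = grouped.count()  # skips the NaN in group "a"
row_counts = grouped.size()        # counts every row, null or not

# Put both results side by side, indexed by group.
comparison = pd.DataFrame({"count": non_null_counts, "size": row_counts})
print(comparison)
```

For group `a`, `count` should report 1 while `size` reports 2. For group `b`, which has no nulls, the two agree.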
Via Christopher Tao and Soner Yıldırım.