
  1. The count() aggregation function counts only non-null values. To count every value in a group, null or not, use size() instead.

  2. You can name the aggregated columns by passing them as keyword arguments to the agg function. Here I unpack a dictionary with ** so that the column names can be plain string constants.

     # Series level
     df.groupby("class")["sepal length (cm)"].agg(
         **{
             # 'new column': 'function',
             "sepal_average_length": "mean",
             "sepal_standard_deviation": "std",
         }
     )
    
     # DataFrame level
     df.groupby(["class"]).agg(
         **{
             # 'new column': ('column', 'function'),
             "sepal_average_length": ("sepal length (cm)", "mean"),
             "sepal_standard_deviation": ("sepal length (cm)", "std"),
         }
     )
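To make the first point concrete, here is a minimal sketch of count() versus size() on a grouped column. The frame `df_demo` and its `value` column are made up for illustration and are not part of the iris data used above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: group "a" contains one missing value.
df_demo = pd.DataFrame(
    {
        "class": ["a", "a", "b"],
        "value": [1.0, np.nan, 3.0],
    }
)

grouped = df_demo.groupby("class")["value"]

non_null_counts = grouped.count()  # skips the NaN in group "a"
row_counts = grouped.size()        # counts every row, NaN included

print(non_null_counts.to_dict())  # {'a': 1, 'b': 1}
print(row_counts.to_dict())       # {'a': 2, 'b': 1}
```

The two results differ only for group "a", which is the group that holds the missing value.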
    

Via Christopher Tao and Soner Yıldırım.
