
Replicating Excel's Descriptive Statistics analysis tool in Python / adding mode to the describe() function


I'm trying to replicate Excel's Descriptive Statistics (summary statistics) analysis tool in Python (Jupyter notebook) by aggregating some of the descriptive statistics available in the pandas library, but every time I add the mode function to the code, it returns:

ValueError: cannot combine transform and aggregation operations

My code is:

df2 = df[["pm10", "so2", "co", "o3", "no2"]]
df2.agg(
    {
        "pm10": ["mean", "sem", "median", "std", "var", "kurt", "skew", "min", "max", "sum", "count", "mode"],
        "so2": ["mean", "sem", "median", "std", "var", "kurt", "skew", "min", "max", "sum", "count", "mode"],
        "co": ["mean", "sem", "median", "std", "var", "kurt", "skew", "min", "max", "sum", "count", "mode"],
        "o3": ["mean", "sem", "median", "std", "var", "kurt", "skew", "min", "max", "sum", "count", "mode"],
        "no2": ["mean", "sem", "median", "std", "var", "kurt", "skew", "min", "max", "sum", "count", "mode"]
    }
)

It only returns an error when the mode function is included; the other functions work well. This is my dataset.

The result that I want: I want mode to be included in the aggregation alongside the other statistics.


Solution

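  • Try using the mode function from the statistics package. The pandas mode returns a Series (a column can have several modes), so agg treats it as a transform and refuses to combine it with reducers such as mean. statistics.mode returns a single value per column: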

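    from statistics import mode  # returns a single value, unlike the pandas mode

    # Pass the imported function itself (not the string "mode") as the last entry.
    func_list = ["mean", "sem", "median", "std", "var", "kurt", "skew", "min", "max", "sum", "count", mode]
    df2.agg(
        {
            "pm10": func_list,
            "so2": func_list,
            "co": func_list,
            "o3": func_list,
            "no2": func_list
        }
    )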
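
If you would rather stay within pandas, one possible alternative is to run the ordinary aggregation first and then append the mode as an extra row. The sketch below is only an illustration: the sample values are made up and stand in for the real dataset, first_mode is a hypothetical helper name, and the list of statistics is shortened for brevity. When a column has several modes, this picks the smallest one, because Series.mode() returns its results sorted.

```python
import pandas as pd

# Made-up sample data standing in for the question's dataset (assumption).
df2 = pd.DataFrame({
    "pm10": [60, 60, 55, 70, 60],
    "so2": [30, 28, 30, 31, 30],
})

# Series.mode() returns a Series, not a scalar, which is why the string
# "mode" cannot be mixed with reducers such as "mean" inside agg().
assert isinstance(df2["pm10"].mode(), pd.Series)

def first_mode(series):
    """Hypothetical helper: reduce a column to its smallest mode."""
    return series.mode().iloc[0]

# Aggregate with the reducing functions only (shortened list).
stats = ["mean", "median", "min", "max", "count"]
summary = df2.agg(stats)

# Append the mode as one more row; apply() calls first_mode once per column.
summary.loc["mode"] = df2.apply(first_mode)

print(summary)
```

Compared with statistics.mode, this keeps the tie-breaking rule explicit, at the cost of one extra step after agg().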