Calculating the mean of groups in python/pandas


My grouped data looks like:

deviceid                                  time    
01691cbb94f16f737e4c83eca8e5f5e5390c2801  January       10
022009f075929be71975ce70db19cd47780b112f  April        566
                                          August       210
                                          January        4
                                          July         578
                                          June        1048
                                          May         1483
02bad1cdf92fbaa9327a65babc1c081e59fbf435  November     309
                                          October       54

The last column is the count. I obtained this grouped representation using the expression:

data1.groupby(['deviceid', 'time'])

How do I get the average for each device id, i.e., the sum of the counts of all months divided by the number of months? My output should look like:

deviceid                                  mean    
01691cbb94f16f737e4c83eca8e5f5e5390c2801  10
022009f075929be71975ce70db19cd47780b112f  777.8
02bad1cdf92fbaa9327a65babc1c081e59fbf435  181.5

Solution

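  • You can specify the level in the mean method, where s is the Series of counts shown in the question. Note that the level argument was deprecated in pandas 1.3 and removed in pandas 2.0, so this form only works on older versions:

    s.mean(level=0)  # or: s.mean(level='deviceid')

    This is equivalent to grouping by the first level of the index and taking the mean of each group, which also works in current pandas versions: s.groupby(level=0).mean()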
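
As a minimal sketch of the groupby form, the snippet below rebuilds the counts from the question as a Series with a two-level index and averages over the deviceid level. The variable names s and means are assumptions made for illustration, and the data is typed in by hand from the table in the question rather than produced by the original data1.groupby(...) call.

```python
import pandas as pd

# Sample data copied by hand from the table in the question (illustrative only).
index = pd.MultiIndex.from_tuples(
    [
        ("01691cbb94f16f737e4c83eca8e5f5e5390c2801", "January"),
        ("022009f075929be71975ce70db19cd47780b112f", "April"),
        ("022009f075929be71975ce70db19cd47780b112f", "August"),
        ("022009f075929be71975ce70db19cd47780b112f", "January"),
        ("022009f075929be71975ce70db19cd47780b112f", "July"),
        ("022009f075929be71975ce70db19cd47780b112f", "June"),
        ("022009f075929be71975ce70db19cd47780b112f", "May"),
        ("02bad1cdf92fbaa9327a65babc1c081e59fbf435", "November"),
        ("02bad1cdf92fbaa9327a65babc1c081e59fbf435", "October"),
    ],
    names=["deviceid", "time"],
)
s = pd.Series([10, 566, 210, 4, 578, 1048, 1483, 309, 54], index=index)

# Group by the first index level (deviceid) and average the monthly counts.
# level=0 would select the same level by position.
means = s.groupby(level="deviceid").mean()
print(means)
```

The result is a Series indexed by deviceid with one mean per device. For the rows typed in here, the second device averages its six listed months, so its value may differ from the figure in the question's expected output if that figure was computed from data not shown.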