Modify output from Python Pandas describe


Is there a way to omit some of the output from pandas' describe? This command gives me a table that includes what I want (the count and mean of executeTime for each simpleDate):

df.groupby('simpleDate').executeTime.describe().unstack(1)

However, count and mean are all I want. I'd like to drop std, min, max, and so on. So far I've only read how to modify column size.

I'm guessing the answer is to rewrite the line without describe, but I haven't had any luck grouping by simpleDate and getting both the count and the mean of executeTime.

I can do count by date:

df.groupby(['simpleDate']).size()

or executeTime by date:

df.groupby(['simpleDate']).mean()['executeTime'].reset_index()

But I can't figure out the syntax to combine them.

My desired output:

            count  mean  
09-10-2013      8  20.523   
09-11-2013      4  21.112  
09-12-2013      3  18.531
...            ..  ...

Solution

  • Called on a Series, describe() returns a Series indexed by statistic name, so you can select just the entries you want by label:

    In [5]: import numpy as np; from pandas import Series
    
    In [6]: s = Series(np.random.rand(10))
    
    In [7]: s
    Out[7]: 
    0    0.302041
    1    0.353838
    2    0.421416
    3    0.174497
    4    0.600932
    5    0.871461
    6    0.116874
    7    0.233738
    8    0.859147
    9    0.145515
    dtype: float64
    
    In [8]: s.describe()
    Out[8]: 
    count    10.000000
    mean      0.407946
    std       0.280562
    min       0.116874
    25%       0.189307
    50%       0.327940
    75%       0.556053
    max       0.871461
    dtype: float64
    
    In [9]: s.describe()[['count','mean']]
    Out[9]: 
    count    10.000000
    mean      0.407946
    dtype: float64
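
The same idea should carry over to the grouped case in the question. The following is a minimal sketch, not a definitive answer: the sample data is made up for illustration, and only the column names simpleDate and executeTime come from the question. It assumes a reasonably recent pandas, where describe() on a grouped column returns a DataFrame with one column per statistic.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "simpleDate": ["09-10-2013", "09-10-2013", "09-11-2013",
                   "09-12-2013", "09-12-2013"],
    "executeTime": [20.0, 21.0, 22.5, 18.0, 19.0],
})

# Compute only the statistics of interest, one row per date.
summary = df.groupby("simpleDate")["executeTime"].agg(["count", "mean"])
print(summary)

# Alternative: run describe() and keep just the two columns.
# Assumes a pandas version where grouped describe() returns a DataFrame.
described = df.groupby("simpleDate")["executeTime"].describe()[["count", "mean"]]
print(described)
```

Using agg avoids computing std, min, max and the quartiles only to discard them. Note that count counts non-null values, whereas size() in the question counts all rows in each group.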