Tags: python, pandas, dataframe, format, scientific-notation

How to change the format of .describe() output?


I called .describe() on a DataFrame column, and the output doesn't look nice. I want it to show the full numbers instead of scientific (exponential) notation.

Input:

df["A"].describe()

What the output looks like:

count    6.000000e+01
mean     7.123568e+04
std      2.144483e+05
min      1.000000e+02
25%      2.770080e+03
50%      1.557920e+04
75%      4.348470e+04
max      1.592640e+06
Name: A, dtype: float64

Expected Output:

count    60.0
mean     71235.68
std      214448.3
min      100.0000
25%      2770.080
50%      15579.20
75%      43484.70
max      1592640.0
Name: A, dtype: float64

Solution

  • You can change the display.float_format option of pandas with pd.set_option. This affects how every float is displayed for the rest of the session:

    import pandas as pd
    import numpy as np
    
    pd.set_option('display.float_format', lambda x: '%.5f' % x)
    
    data = pd.DataFrame()
    
    # Random values in [5870000, 5870001), matching the output below
    data['X'] = np.random.rand(1000) + 10000000 * 0.587
    
    data['X'].describe()
    
    # Output 
    count      1000.00000
    mean    5870000.47894
    std           0.28447
    min     5870000.00037
    25%     5870000.23637
    50%     5870000.45799
    75%     5870000.71652
    max     5870000.99774
    Name: X, dtype: float64
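
If changing the format for the whole session is more than you need, the same option can be scoped to a single block with pd.option_context. The following is a minimal sketch, not part of the original answer; the seeded generator and the variable name `scoped` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, seeded so the run is repeatable.
rng = np.random.default_rng(0)
data = pd.DataFrame({"X": rng.random(1000) + 10_000_000 * 0.587})

# The format applies only inside the with-block; the previous
# display.float_format setting is restored on exit.
with pd.option_context("display.float_format", "{:.5f}".format):
    scoped = str(data["X"].describe())

print(scoped)
```

Outside the with-block, other Series and DataFrames print with whatever format was active before.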
    

    Or, without touching set_option, apply a format function to the Series that describe() returns. Note that the result then holds formatted strings rather than floats:

    import pandas as pd
    import numpy as np
    
    data = pd.DataFrame()
    
    data['X'] = np.random.rand(1000, ) + 10000000 * 0.587
    
    data['X'].describe().apply("{0:.5f}".format)
    
    # Output
    count       1000.00000
    mean     5870000.48955
    std            0.29247
    min      5870000.00350
    25%      5870000.22416
    50%      5870000.50163
    75%      5870000.73457
    max      5870000.99995
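
If the statistics need to stay numeric for later calculations, one option is to format only the printed text with Series.to_string and its float_format parameter. This is a sketch under assumptions: the values are copied from the question's output as stand-in data, and the names `stats`, `as_text`, and `printed` are illustrative.

```python
import pandas as pd

# Summary values copied from the question's output, used as stand-in data.
stats = pd.Series(
    [6.000000e+01, 7.123568e+04, 2.144483e+05, 1.000000e+02,
     2.770080e+03, 1.557920e+04, 4.348470e+04, 1.592640e+06],
    index=["count", "mean", "std", "min", "25%", "50%", "75%", "max"],
    name="A",
)

# apply() converts each value to a string: fine for display,
# but the result can no longer be used in arithmetic.
as_text = stats.apply("{:.2f}".format)

# to_string() formats only the printed text; `stats` itself stays float64.
printed = stats.to_string(float_format="{:.2f}".format)
print(printed)
```

With real data, `stats` would simply be `df["A"].describe()`.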