Search code examples
pythonpandasfrequency

get the most frequent group of values with pandas in python


I have an ascii file as following

1   306.0416667
2   286.1666667
3   207.5
4   226.4166667
5   304.2083333
6   336.1666667
7   255.5416667
8   224.5833333
9   190.1666667
10  163.5
11  231.125
12  167.3333333
13  193.5416667
14  165
15  166
16  172.173913
17  158.9166667
18  196.8333333
19  154.875
20  303.4166667

I want to found the most frequent group of values. The groups are 0-90, 90-180, 180-270, 270-360.

I tried to use .value_counts() but with no success (even though without grouping the values).

import pandas as pd
col_names=['id','val']
df = pd.read_csv(i,names=col_names,header=None)
df['val'].value_counts().[:1].index.tolist()

Solution

  • You can use pd.cut, groupby(), count() like below:

    >>> df = pd.DataFrame({
        'freq': [306.0416667, 286.1666667, 207.5 , 226.4166667 , 304.2083333 , 
                 336.1666667 , 255.5416667, 224.5833333 , 190.1666667, 163.5 , 
                 231.125, 167.3333333 , 193.5416667 , 165 , 154.875 , 303.4166667]})
    
    >>> ranges = [0,90,180,270, 360]
    >>> df.groupby(pd.cut(df['freq'], ranges)).count()
    
               freq
    freq    
    (0, 90]     0
    (90, 180]   4
    (180, 270]  7
    (270, 360]  5
    
    >>> df.groupby(pd.cut(df['freq'], ranges)).count().idxmax()
    freq    (180, 270]
    dtype: interval