
Extract count of single category from a pandas DataFrame


I currently have a DataFrame containing info on e-mails sent from one job title to another.

      fromJobtitle         toJobtitle  e-mails
0              CEO                CEO       65
1              CEO           Director       23
2              CEO           Employee       56
3              CEO    In House Lawyer        7
4              CEO            Manager      104
..             ...                ...      ...
87  Vice President  Managing Director      112
88  Vice President          President      385
89  Vice President             Trader       78
90  Vice President            Unknown     1088
91  Vice President     Vice President     2304

I am looking for a way to get the total e-mail count for each job title. The example output would be:

        totalJobtitle       e-mails
0                 CEO           670
1   Managing Director          2341
2      Vice President          4720
3            Employee          3560
4              Trader           250

Solution

  • A small example to work with:

    import pandas as pd

    # Toy data: two senders with two rows each
    d = {'fromJobtitle': ["CEO", "CEO", "VicePresident", "VicePresident"], 'mail': [3, 4, 5, 6]}
    df = pd.DataFrame(data=d)
    

    df:

        fromJobtitle  mail
    0            CEO     3
    1            CEO     4
    2  VicePresident     5
    3  VicePresident     6
    

    Now sum the mail counts per job title:

    # Passing "sum" as a string needs no numpy import
    df = pd.pivot_table(df, index=['fromJobtitle'], values=['mail'], aggfunc="sum")
    

    df:

                   mail
    fromJobtitle
    CEO               7
    VicePresident    11
    

    Documentation for pivot_table: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.pivot_table.html
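
The toy example above only sums the sender column. Here is a minimal sketch of one way to get a single total per job title from a frame shaped like the one in the question. It assumes "total" means e-mails sent plus e-mails received, which the question does not state outright, and the sample values are made up for illustration. Under that assumption, an e-mail sent within the same job title (e.g. CEO to CEO) is counted once as sent and once as received.

```python
import pandas as pd

# Hypothetical sample in the same shape as the question's DataFrame
df = pd.DataFrame({
    "fromJobtitle": ["CEO", "CEO", "Vice President", "Vice President"],
    "toJobtitle": ["CEO", "Employee", "CEO", "Vice President"],
    "e-mails": [65, 56, 10, 2304],
})

# Total e-mails sent per job title
sent = df.groupby("fromJobtitle")["e-mails"].sum()

# Total e-mails received per job title
received = df.groupby("toJobtitle")["e-mails"].sum()

# Assumption: the wanted total is sent + received.
# fill_value=0 keeps titles that appear on only one side.
total = (
    sent.add(received, fill_value=0)
    .astype(int)
    .rename_axis("totalJobtitle")
    .reset_index(name="e-mails")
)

print(total)
```

If only the sent totals are wanted, `sent.reset_index()` on its own should give the same result as the pivot_table call above.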