Search code examples
pythonpython-3.xpandascrosstab

How do I sum rows inside of pandas crosstab and make a new crosstab?


I have data from an excel sheet I have summarized in a pandas crosstab. I want to categorize the data further by summing related rows.

Here is my crosstab:

class_of_orbit         Elliptical  GEO  LEO  MEO  All
users
Civil                           0    0   36    0   36
Civil/Government                0    0    2    0    2
Commercial                      3   99  412    0  514
Government                      9   14   38    0   61
Government/Civil                0    0   10    0   10
Government/Commercial           0    2   81    0   83
Government/Military             0    0    1    0    1
Military                        9   67   66    0  142
Military/Civil                  0    0    2    0    2
Military/Commercial             0    0    0   32   32
All                            21  182  648   32  883

I only want 4 groups: civil, govt,commercial, and military. If "Government" is in the name, I want to sum all the rows that contain it. If "Military" is in the name I want to sum the rows into a military row....

What is the best way to do this?


Solution

  • pd.crosstab

    Do it from the start

    pd.crosstab(df.users.str.split('/').str[0], df.class_of_orbit)
    

    groupby

    On top of what you already have. If you pass a callable to groupby it will apply that to the index and use the result to group by.

    xtab.groupby(lambda x: x.split('/')[0]).sum()
    
                Elliptical  GEO  LEO  MEO  All
    All                 21  182  648   32  883
    Civil                0    0   38    0   38
    Commercial           3   99  412    0  514
    Government           9   16  130    0  155
    Military             9   67   68   32  176
    ​