I have data from an excel sheet I have summarized in a pandas crosstab. I want to categorize the data further by summing related rows.
Here is my crosstab:
class_of_orbit Elliptical GEO LEO MEO All users Civil 0 0 36 0 36 Civil/Government 0 0 2 0 2 Commercial 3 99 412 0 514 Government 9 14 38 0 61 Government/Civil 0 0 10 0 10 Government/Commercial 0 2 81 0 83 Government/Military 0 0 1 0 1 Military 9 67 66 0 142 Military/Civil 0 0 2 0 2 Military/Commercial 0 0 0 32 32 All 21 182 648 32 883
I only want 4 groups: civil, govt,commercial, and military. If "Government" is in the name, I want to sum all the rows that contain it. If "Military" is in the name I want to sum the rows into a military row....
What is the best way to do this?
pd.crosstab
Do it from the start
pd.crosstab(df.users.str.split('/').str[0], df.class_of_orbit)
groupby
On top of what you already have. If you pass a callable to groupby
it will apply that to the index and use the result to group by.
xtab.groupby(lambda x: x.split('/')[0]).sum()
Elliptical GEO LEO MEO All
All 21 182 648 32 883
Civil 0 0 38 0 38
Commercial 3 99 412 0 514
Government 9 16 130 0 155
Military 9 67 68 32 176