I am trying to add a new column with subtotals and a final column with totals. For example,
df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo", "bar", "bar", "bar", "bar"],
"B": ["one", "one", "one", "two", "two","one", "one", "two", "two"],
"C": ["small", "large", "large", "small","small", "large", "small", "small", "large"],
"D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
"E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
i.e:
A B C D E
0 foo one small 1 2
1 foo one large 2 4
2 foo one large 2 5
3 foo two small 3 5
4 foo two small 3 6
5 bar one large 4 6
6 bar one small 5 8
7 bar two small 6 9
8 bar two large 7 9
Now I pivot:
table = pd.pivot_table(df, values=['D',"E"], index=['A'],columns=['C'])
and add the totals:
table['total'] = table.sum(axis=1)
for t in ["D", "E"]:
table[t, "partial_total"] = table[t].sum(axis=1)
While this numerically works, visually it's annoying. I would like to have all the data for D
(including the partial_total
), then E
, then the total
. Here's my resulting df:
D E total D E
C large small large small partial_total partial_total
A
bar 5.5 5.500000 7.5 8.500000 27.000000 11.000000 16.000000
foo 2.0 2.333333 4.5 4.333333 13.166667 4.333333 8.833333
so
how do I group together the values for the same (top level) columns?
Trying to perform operations before the pivot_table
g = df.groupby(['A', 'C'])[['D', 'E']]
d = (g.sum()/g.count()).reset_index()
m = d.groupby('A', as_index=False).sum().assign(C='partial')
final = pd.concat([m, d]).pivot_table(index='A', columns='C')
D E
C large small partial large small partial
A
bar 5.5 5.500000 11.000000 7.5 8.500000 16.000000
foo 2.0 2.333333 4.333333 4.5 4.333333 8.833333
To answer specifically your last question
how do I group together the values for the same (top level) columns?
You may just sort_index
table.sort_index(axis=1)
D E total
C large partial_total small large partial_total small
A
bar 5.5 11.000000 5.500000 7.5 16.000000 8.500000 27.000000
foo 2.0 4.333333 2.333333 4.5 8.833333 4.333333 13.166667