Basically I have data that looks like this, with the data grouped by Company and Program and the top 3 Programs by Revenue listed for each Company:
That I would like to look like this, where the Companies are on the rows still, but its top 3 Programs are now in their own column, no need to keep the Revenue values in the visible output:
I've tried a few solutions but it wasn't shaping quite the way I expected and I'm having difficulty finding the exact solution online
I'm new to Python so any help would be greatly appreciated!
I heavily abbreviated your data frame images. In the future please do not upload images of code but instead give the text itself. Many corporate networks disable image loading.
df = pd.DataFrame(
{'c': ['A', 'A', 'A', 'B', 'B', 'B'], 'p': list('WCXANV'),
'r': [10, 5, 3, 13, 7, 1]})
Create descending ranks within each company:
df['t'] = df.sort_values(by=['c', 'r'], ascending=False) \
.assign(t=1).groupby('c')['t'].cumcount() + 1
Stack on company and rank then unstack rank and reset indices.
>>> df.set_index(['c', 't'])['p'].unstack().reset_index()
t c 1 2 3
0 A W C X
1 B A N V
I usually think geometrically though so, for me, stack
and unstack
operations are very intuitive. Alternatively use pd.pivot
:
>>> df.pivot(index='c', columns='t', values='p')
t 1 2 3
c
A W C X
B A N V