I am trying to run analysis on a merged dataframe. Currently, my data is something like this:
data2 = pd.DataFrame({'county':['A', 'B', 'C', 'D'],
'1990pop': [100, 200, 150, 400],
'1990price': [35, 20, 10, 60],
'1991pop': [110, 210, 160, 410],
'1991price': [7, 9, 8, 15]})
data2.set_index('county')
I want to create a MultiIndex so for for each row, you see the county, and then each county has an entry for a 'population' and a 'price' variable, and only one column for each for each year 1990 and 1991.
Here's what I'd like the dataframe to look like
I've tried various merging and MultiIndex guides but this is a bit beyond my ability. I'd greatly appreciate any help!
Since years are safely 4 digits, I would create a MultiIndex on the columns, then stack
:
data2.columns = pd.MultiIndex.from_arrays([data2.columns.str[0:4], data2.columns.str[4:]],
names=[None, 'Type'])
data2.stack(1)
1990 1991
county Type
A pop 100 110
price 35 7
B pop 200 210
price 20 9
C pop 150 160
price 10 8
D pop 400 410
price 60 15