So, I have the following two Multiindex-Dfs:
data = {('California', 0): 'LA',
('California', 1): 'SF',
('Texas', 0): 'HO',
('New York', 0): 'BX',
('New York', 1): 'NY'}
df= pd.Series(data)
df = pd.DataFrame(df)
df
#needs column name
df.index.names = ['state', 'Idx']
df
Which returns:
0
state Idx
California 0 LA
1 SF
Texas 0 HO
New York 0 BX
1 NY
And the second:
data2 = {('California', 0): '20%',
('California', 2): '30%'}
df2 = pd.Series(data2)
df2 = pd.DataFrame(df2)
df2
df2.index.names = ['state', 'X']
df2
Which returns:
X 0
state
California 0 20%
2 30%
Now: Id'like to combine the two, so that it looks like the following.
The tricky bit: The df Idx 0 and 1 have to be combined with 0 and 2 in the df2. What do I do?
0
state Idx
California 0 LA 20%
1 SF 30%
Texas 0 HO
New York 0 BX
1 NY
Thanks already!
You could use:
(pd.concat([df.unstack('state'),
df2.unstack('state').rename(columns = {0:1}).reset_index(drop=True)]
,axis=1)
.stack().swaplevel().sort_index(level =0))
or
df.join(df2.reset_index().drop(columns = 'X')
.set_index('state',append = True)
.rename(columns = {0:1})
.swaplevel()
.rename_axis(index = [*df.index.names])
)
# 0 1
#state Idx
#California 0 LA 20%
# 1 SF 30%
#Texas 0 HO NaN
#New York 0 BX NaN
# 1 NY NaN