I have two multi-indexed dataframes df1
with three levels and df2
with two. The indices resulted from df1.groupby([col_1, col_2, col_3])
and df2.groupby([col_1, col_2])
. col_1
and col_2
are the same in both dataframes, but, because of the third level in df1
of different lenghts; df1
has 2425 rows and df2
783.
What I'm trying to do is to merge both dataframes so that df2
gets spread up that the length of indices level 0 and 1 are of the same length in df1
and df2
so that the resulting dataframe is also of 2425 rows.
I used df3 = df1.merge(df2, left_index=True, right_index=True)
but the resulting dataframe remains with only 2385 rows.
I used df3 = pd.concat([df1, df2], axis=1)
but it raised a ValueError: operands could not be broadcast together with shapes
.
Is there an elegant way to solve this? I appreciate every help
EDIT: data sample
df1:
Areaclccat1990 ... Areaclccat2012
FID_Weser_Catchments_134_WQ_Stations_FINAL_LAEA... SNR1 gridcode_1 ...
0 3152 1 0.002764 ... 0.007248
2 0.980105 ... 0.972941
3 0.005049 ... 0.017166
4 0.012082 ... 0.002645
3155 1 NaN ... 0.000003
2 1.000000 ... 0.996788
3 NaN ... 0.003209
3255 1 NaN ... 0.058950
2 0.989654 ... 0.941050
4 0.010346 ... NaN
5958 1 NaN ... 0.004463
2 0.955098 ... 0.958452
3 0.014408 ... 0.027835
4 0.030494 ... 0.009250
5966 1 0.007184 ... 0.011448
2 0.955668 ... 0.949824
3 0.037148 ... 0.038728
5970 1 NaN ... 0.001141
2 0.979750 ... 0.930495
3 0.011281 ... 0.068364
df2:
Areaclccat1990 ... Areaclccat2012
FID_Weser_Catchments_134_WQ_Stations_FINAL_LAEA... SNR1 ...
0 3152 1654.636456 ... 1550.415658
3155 1820.433231 ... 1758.125539
3255 43.056576 ... 39.436385
5958 2306.806057 ... 2120.791289
5966 7.444977 ... 5.763853
5970 3087.717009 ... 2615.253450
6435 240.342745 ... 255.033888
6534 647.293171 ... 621.116222
6535 9929.136397 ... 9653.021903
6611 947.912232 ... 754.783147
6631 13528.073523 ... 13545.356498
6632 14023.097062 ... 13897.394309
6633 5913.895620 ... 5398.585720
6634 17463.795952 ... 17159.138628
6635 10791.618411 ... 10306.725199
6636 9664.138661 ... 9742.442935
9473 131.268559 ... 128.477078
9672 107.831005 ... 102.464959
9673 13.044806 ... 29.566828
16051 443.810802 ... 428.493495
Coonvert third level to column before merge
with how='left'
for left join:
df3 = df1.reset_index(level=2).merge(df2, left_index=True, right_index=True, how='left')