I have two dataframes, let's say, material inventory reports, for Jan and Feb:
January Report
code description qty_jan amount_jan
WP1 Wooden Part-1 1000 50000
MP1 Metal Part-1 500 5000
GL1 Glass-1 100 2500
February Report
code description qty_feb amount_feb
WP1 Wooden Part-1 1200 60000
MP2 Metal Part-2 300 3000
GL1 Glass-1 50 1250
GL2 Glass-2 200 5000
To monitor the progress of each material inventory, I would like to merge two reports, as follows:
code description qty_jan amount_jan qty_feb amount_feb
WP1 Wooden Part-1 1000 50000 1200 60000
MP1 Metal Part-1 500 5000 0 0
MP2 Metal Part-2 0 0 300 3000
GL1 Glass-1 100 2500 50 1250
GL2 Glass-2 0 0 200 5000
Note: Materials that are not listed in a report, are considered zero stock.
How to merge these two reports?
You can use outer join in DataFrame.merge
and then replace missing values to 0
:
df = df1.merge(df2, on=['code','description'], how='outer').fillna(0)
print (df)
v code description qty_jan amount_jan qty_feb amount_feb
0 WP1 Wooden Part-1 1000.0 50000.0 1200.0 60000.0
1 MP1 Metal Part-1 500.0 5000.0 0.0 0.0
2 GL1 Glass-1 100.0 2500.0 50.0 1250.0
3 MP2 Metal Part-2 0.0 0.0 300.0 3000.0
4 GL2 Glass-2 0.0 0.0 200.0 5000.0
Another idea with concat
:
df = pd.concat([df1.set_index(['code','description']),
df2.set_index(['code','description'])], axis=1).fillna(0).reset_index()
print (df)
code description qty_jan amount_jan qty_feb amount_feb
0 GL1 Glass-1 100.0 2500.0 50.0 1250.0
1 GL2 Glass-2 0.0 0.0 200.0 5000.0
2 MP1 Metal Part-1 500.0 5000.0 0.0 0.0
3 MP2 Metal Part-2 0.0 0.0 300.0 3000.0
4 WP1 Wooden Part-1 1000.0 50000.0 1200.0 60000.0