Assume I have a DataFrame as follows:
import pandas as pd

df = pd.DataFrame({'legs': [2, 4, 8, 0],
                   'wings': [2, 0, 0, 0],
                   'specimen': [10, 2, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])
df
resulting in:
        legs  wings  specimen
falcon     2      2        10
dog        4      0         2
spider     8      0         1
fish       0      0         8
Now I receive new data in the form of a dict, and I would like to add it to the DataFrame:
new_data = {'dog': {'wings': 45, 'specimen': 89},
            'fish': {'wings': 55555, 'something_new': 'new value'},
            'new_row': {'wings': 90}}
new_data_df = pd.DataFrame(new_data).T
new_data_df
I could use append to add the data to the first DataFrame, but append is deprecated (and removed as of pandas 2.0), so I would rather avoid it. I can use concat, as suggested in "append dictionary to data frame":
output = pd.concat([df, new_data_df], ignore_index=False)
output
but this does not give the required result:
I don't want row index labels to be duplicated. The new data should overwrite existing values, and a new column or row should be added whenever one appears in the dict. There should be one and only one dog row in the index. In the concat result, the row dog appears twice.
Changing ignore_index=False to True does not help; the index labels are simply discarded and replaced by integers.
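The duplication described above can be confirmed with a small check. This is only a sketch built on the data from the question; the names dupes and reset are illustrative and not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({'legs': [2, 4, 8, 0],
                   'wings': [2, 0, 0, 0],
                   'specimen': [10, 2, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])
new_data = {'dog': {'wings': 45, 'specimen': 89},
            'fish': {'wings': 55555, 'something_new': 'new value'},
            'new_row': {'wings': 90}}
new_data_df = pd.DataFrame(new_data).T

# concat stacks the rows of both frames, so labels present in both appear twice
output = pd.concat([df, new_data_df], ignore_index=False)
dupes = output.index[output.index.duplicated()].tolist()

# ignore_index=True drops the labels entirely and renumbers the rows
reset = pd.concat([df, new_data_df], ignore_index=True)
```

Here dupes lists the labels that occur more than once, and reset has a plain integer index, which is why neither setting of ignore_index merges the rows.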
You can use combine_first. Values from new_data_df take precedence, and any gaps are filled from df:
out = new_data_df.combine_first(df)
Out[144]:
         legs something_new specimen  wings
dog       4.0           NaN     89.0   45.0
falcon    2.0           NaN     10.0    2.0
fish      0.0     new value      8.0  55555
new_row   NaN           NaN      NaN   90.0
spider    8.0           NaN      1.0    0.0
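As the output shows, combine_first returns rows and columns in sorted order. If the original order matters, one possible follow-up is to reindex the result. The sketch below rebuilds the data from the question so it runs on its own; row_order, col_order and ordered are hypothetical names introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'legs': [2, 4, 8, 0],
                   'wings': [2, 0, 0, 0],
                   'specimen': [10, 2, 1, 8]},
                  index=['falcon', 'dog', 'spider', 'fish'])
new_data = {'dog': {'wings': 45, 'specimen': 89},
            'fish': {'wings': 55555, 'something_new': 'new value'},
            'new_row': {'wings': 90}}
new_data_df = pd.DataFrame(new_data).T

# Values in new_data_df win; missing cells fall back to df
out = new_data_df.combine_first(df)

# Keep the original labels first, then append labels seen only in the new data
row_order = list(df.index) + [i for i in new_data_df.index if i not in df.index]
col_order = list(df.columns) + [c for c in new_data_df.columns if c not in df.columns]
ordered = out.reindex(index=row_order, columns=col_order)
```

Because new_data_df holds mixed types, some result columns may end up with object dtype; converting them afterwards (for example with infer_objects) is optional and depends on what the data is used for.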