Let's say I have the following multi-indexed dataframe, generated with the following code:
import pandas as pd, numpy as np
names = ['Name1','Name2','Name3','Name4']
values = ['x1','x2','x3','x4']
categories = ['y1','y2','y3']
x1 = pd.Series([0, 0, 0], index=categories)
index = pd.MultiIndex.from_product([names, values]); placeholders = np.zeros((len(names)*len(values), len(categories)))
df = pd.DataFrame(placeholders, index=index, columns=categories)
for i in names:
for j in values:
df.loc[i,j] = x1
y1 y2 y3
Name1 x1 0.0 0.0 0.0
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0
Name2 x1 0.0 0.0 0.0
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0
Name3 x1 0.0 0.0 0.0
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0
Name4 x1 0.0 0.0 0.0
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0
How would I use a dictionary such as the following to fill the column y3
corresponding to row x1
and filling np.nan
for values for any other rows in the y3
column or when a given Name (i.e. Name1, Name2, Name3, etc.) is not a key in the dictionary?
{'Name1': 54, 'Name3': 50}
Expected output (0's could be np.nan
):
y1 y2 y3
Name1 x1 0.0 0.0 54
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0
Name2 x1 0.0 0.0 0.0
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0
Name3 x1 0.0 0.0 50
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0
Name4 x1 0.0 0.0 0.0
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0
The idiomatic way is probably to use update
In [31]: df2 = pd.DataFrame({(k,'x1'): {'y3': v} for k, v in d.items()}).T
In [32]: df2
Out[32]:
y3
Name1 x1 54
Name3 x1 50
In [33]: df.update(df2)
In [34]: df
Out[34]:
y1 y2 y3
Name1 x1 0.0 0.0 54.0
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0
Name2 x1 0.0 0.0 0.0
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0
Name3 x1 0.0 0.0 50.0
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0
Name4 x1 0.0 0.0 0.0
x2 0.0 0.0 0.0
x3 0.0 0.0 0.0
x4 0.0 0.0 0.0