The fill_value
argument of pandas.DataFrame.multiply()
fills missing values in both dataframes. However, I'd like to have only missing values filled in the 2nd DataFrame. What would be a good way beyond my hacky solution below?
>>> df1 = pd.DataFrame({'a':[1, np.nan, 2], 'b':[np.nan, 3, 4]}, index = [1, 2, 3])
>>> df1
a b
1 1.0 NaN
2 NaN 3.0
3 2.0 4.0
>>> df2 = pd.DataFrame({'a':[2, np.nan], 'b':[3, np.nan], 'c':[1, 1]}, index = [1, 2])
>>> df2
a b c
1 2.0 3.0 1.0
2 NaN NaN 1.0
I would like to multiply the two DataFrames element-wise, by keeping df1
as the dominant one so that the resulting shape and NaN
entries should match df1
, while filling NaN
s in df2
by value 1, to get
a b
1 2.0 NaN
2 NaN 3.0
3 2.0 4.0
The naive solution doesn't work:
>>> df1.multiply(df2, fill_value=1)
a b c
1 2.0 3.0 1.0
2 NaN 3.0 1.0
3 2.0 4.0 NaN
My hacky solution is to create a matrix with 1
s where df1
has value, and update by df2
>>> df3 = df1/df1
>>> df3.update(df2)
>>> df3
a b
1 2.0 3.0
2 NaN 1.0
3 1.0 1.0
>>> df1.multiply(df3)
a b
1 2.0 NaN
2 NaN 3.0
3 2.0 4.0
It just doesn't feel very elegant. Any cool idea on direct manipulation with df1
and df2
, hopefully a one-liner?
You can use reindex
and fillna
on df2
:
df1.multiply(df2.reindex(df1.index).fillna(1))
a b
1 2.0 NaN
2 NaN 3.0
3 2.0 4.0
You don't need to explicitly call multiply
in this case, and can just use *
for multiplication:
df1 * df2.reindex(df1.index).fillna(1)
a b
1 2.0 NaN
2 NaN 3.0
3 2.0 4.0
Additionally, if you need to align the columns of df2
with df1
, use the columns
parameter of reindex
:
df1 * df2.reindex(index=df1.index, columns=df1.columns).fillna(1)