How to fill '0' value in df1 from unique value from another dataframe (df2). the expected output is no duplicate in df1.
any reference links for this. thank for helping out.
data1 = {'test' :['b',0,'e',0,0,'f']}
df1 = pd.DataFrame(data=data1)
data2 = {'test' :['a','b','c','d','e','f']}
df2 = pd.DataFrame(data=data2)
df1
test
0 b
1 0
2 e
3 0
4 0
5 f
df2
test
0 a
1 b
2 c
3 d
4 e
5 f
expected output:
test
0 b -- df1
1 a -- fill with a from df2
2 e -- df1
3 c -- fill with c from df2
4 d -- fill with d from df2
5 f -- df1
Assuming you have enough unique values in df2
to fill the 0s in df1
, extract those unique values, and assign them with boolean indexing:
# which rows are 0?
m = df1['test'].eq(0)
# extract values of df2 that are not in df1
vals = df2.loc[~df2['test'].isin(df1['test']), 'test'].tolist()
# ['b', 'e', 'f']
# assign the values in the limit of the needed number
df1.loc[m, 'test'] = vals[:m.sum()]
print(df1)
Output:
test
0 b
1 a
2 e
3 c
4 d
5 f
If there is not always enough values in df2
and you want to fill the first possible 0s:
m = df1['test'].eq(0)
vals = df2.loc[~df2['test'].isin(df1['test']), 'test'].unique()
# ['b', 'e', 'f']
m2 = m.cumsum().le(len(vals))
df1.loc[m&m2, 'test'] = vals[:m.sum()]
print(df1)