I have multiple DataFrames that I want to merge, and I would like the fill value to be an empty string rather than NaN. Some of the DataFrames already contain NaN values. concat almost does what I want, but it fills the empty cells with NaN. How can I avoid filling them with NaN, or specify a fill_value, to get something like this:
>>> df1
   Value1
0       1
1     NaN
2       3
>>> df2
   Value2
1       5
2     NaN
3       7
>>> merge_multiple_without_nan([df1,df2])
   Value1  Value2
0       1
1     NaN       5
2       3     NaN
3               7
This is what concat does:
>>> concat([df1,df2], axis=1)
   Value1  Value2
0       1     NaN
1     NaN       5
2       3     NaN
3     NaN       7
Well, I couldn't find any option in concat or merge that handles this by itself, but the code below works without much hassle:
Step 1:
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'Value1': [1, np.nan, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({'Value2': [5, np.nan, 7]}, index=[1, 2, 3])
# Replace the existing NaN values with temporary placeholders.
# 'X' and 'Y' must not occur anywhere in the real data.
df = pd.concat([df1.fillna('X'), df2.fillna('Y')], axis=1)
df=
   Value1  Value2
0       1     NaN
1       X       5
2       3       Y
3     NaN       7
Step 2:
# Fill the NaN values that concat introduced with empty strings.
df.fillna('', inplace=True)
df=
   Value1  Value2
0       1
1       X       5
2       3       Y
3               7
Step 3:
# Turn the placeholders back into NaN.
df.replace(to_replace=['X','Y'], value=np.nan, inplace=True)
df=
   Value1  Value2
0       1
1     NaN       5
2       3     NaN
3               7
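If picking placeholder strings that are guaranteed not to appear in the data is a concern, a possible alternative is to align each frame to the combined index before concatenating. This is a different technique from the placeholder steps above, offered only as a minimal sketch: it relies on reindex accepting a fill_value that is applied only to row labels the frame did not have, so NaN values that were already present stay untouched. The function name is borrowed from the question, the fill_value parameter is an illustrative assumption, and the sketch assumes every frame has a unique index.

```python
import numpy as np
import pandas as pd


def merge_multiple_without_nan(frames, fill_value=""):
    """Concatenate frames column-wise, filling only the alignment gaps."""
    # Build the union of all row labels (assumes each index is unique).
    full_index = frames[0].index
    for frame in frames[1:]:
        full_index = full_index.union(frame.index)

    aligned = [
        # object dtype lets numbers, NaN and strings share one column;
        # fill_value is used only for labels missing from the frame.
        frame.astype(object).reindex(full_index, fill_value=fill_value)
        for frame in frames
    ]
    return pd.concat(aligned, axis=1)


df1 = pd.DataFrame({"Value1": [1, np.nan, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({"Value2": [5, np.nan, 7]}, index=[1, 2, 3])
result = merge_multiple_without_nan([df1, df2])
print(result)
```

The trade-off is the same as with the placeholder approach: the resulting columns have object dtype, so numeric operations on them will need a conversion back to numbers first.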