python pandas merge alignment concatenation

Merge unaligned DataFrames while filling with empty string


I have multiple DataFrames that I want to merge, and I would like the fill value to be an empty string rather than NaN. Some of the DataFrames already contain NaN values. concat sort of does what I want, but it fills the empty cells with NaN. How can I avoid filling them with NaN, or specify a fill_value, to achieve something like this:

>>> df1
    Value1
0       1
1     NaN
2       3

>>> df2
    Value2
1       5
2     NaN
3       7

>>> merge_multiple_without_nan([df1,df2])
    Value1    Value2
0       1     
1     NaN       5
2       3     NaN
3               7

This is what concat does:

>>> concat([df1,df2], axis=1)
   Value1  Value2
0       1     NaN
1     NaN       5
2       3     NaN
3     NaN       7

Solution

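  • I couldn't find an option in concat or merge that handles this directly, but the steps below work without much hassle. Note that the placeholders 'X' and 'Y' must not occur in the real data.

    Step 1:

    import numpy as np
    import pandas as pd

    df1 = pd.DataFrame({'Value1': [1, np.nan, 3]}, index=[0, 1, 2])
    df2 = pd.DataFrame({'Value2': [5, np.nan, 7]}, index=[1, 2, 3])
    # Replace the NaN values already in each frame with temporary placeholders,
    # so they can be told apart from the NaN values that concat introduces.
    df = pd.concat([df1.fillna('X'), df2.fillna('Y')], axis=1)
    df=
      Value1 Value2
    0      1    NaN
    1      X      5
    2      3      Y
    3    NaN      7

    Step 2:

    # The only NaN values left are the ones concat introduced; blank them out.
    df.fillna('', inplace=True)
    df=
      Value1 Value2
    0      1
    1      X      5
    2      3      Y
    3             7

    Step 3:

    # Turn the placeholders back into the original NaN values.
    df.replace(to_replace=['X','Y'], value=np.nan, inplace=True)
    df=
      Value1 Value2
    0      1
    1    NaN      5
    2      3    NaN
    3             7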
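
The three steps above can also be wrapped into a single helper that avoids placeholder strings altogether. This is only a sketch of one possible approach, not something built into pandas: the name merge_multiple_without_nan is borrowed from the question, the fill_value parameter is an assumption of this example, and it assumes every frame has a unique index. Instead of marking the existing NaN values, it marks the rows that each frame is missing and fills only those.

```python
import numpy as np
import pandas as pd


def merge_multiple_without_nan(frames, fill_value=""):
    """Concatenate frames column-wise, filling alignment gaps with fill_value.

    Hypothetical helper (not a pandas API). NaN values that are already
    present in the inputs are left as NaN. Assumes unique index labels.
    """
    # Build the union of all row labels.
    full_index = frames[0].index
    for frame in frames[1:]:
        full_index = full_index.union(frame.index)

    aligned = []
    for frame in frames:
        # Cast to object so a column can hold numbers, NaN and strings together.
        expanded = frame.astype(object).reindex(full_index)
        # Rows absent from this frame are the gaps that alignment introduced.
        missing_rows = ~full_index.isin(frame.index)
        expanded.loc[missing_rows, :] = fill_value
        aligned.append(expanded)

    return pd.concat(aligned, axis=1)


df1 = pd.DataFrame({"Value1": [1, np.nan, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({"Value2": [5, np.nan, 7]}, index=[1, 2, 3])
result = merge_multiple_without_nan([df1, df2])
print(result)
```

Because nothing is written into the original cells, this variant cannot collide with real data the way the 'X' and 'Y' placeholders could. The trade-off is that every column ends up with object dtype, which is also true of the three-step version once empty strings are mixed in with numbers.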