python pandas dataframe merge xor

How to merge/join/combine two Series in an XOR manner while keeping conflicting values


I have the following DataFrame, where '' (the empty string) is treated as empty:

import pandas as pd

df = pd.DataFrame({1: ['a', 'b', 'c'] + [''] * 2, 2: [''] * 2 + ['d', 'e', 'f']})

    1   2
0   a  ''
1   b  ''
2   c   d
3  ''   e
4  ''   f

How can I merge/join/combine (I don't know the correct term) column 2 into column 1 so that I get:

   1   2
0  a  ''
1  b  ''
2  c   d
3  e  ''
4  f  ''

or, if I decide to merge column 1 into column 2:

    1  2
0  ''  a
1  ''  b
2   c  d
3  ''  e
4  ''  f

I would like to be able to choose which column to merge into; the other column should then contain only the conflicting values. Thank you in advance.


Solution

  • You can build the merged column with the DataFrame method apply(), which calls a function on each row when axis=1:

    Sample data:

    df
       1  2
    0  a   
    1  b   
    2  c  d
    3     e
    4     f
    

    Choose which column to merge into and which column supplies the missing values:

    merge_to_column = 2
    other_column = 1
    

    Use apply to take the value from other_column wherever merge_to_column is empty:

    df['output'] = df.apply(lambda x: x[other_column] if x[merge_to_column] == '' else x[merge_to_column], axis=1)
    

    Output:

    df
       1  2 output
    0  a         a
    1  b         b
    2  c  d      d
    3     e      e
    4     f      f
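
The apply() call above produces only the merged column. To also reproduce the two-column layout from the question, where the other column keeps just the conflicting values, a vectorized variant along the following lines may work. This is a minimal sketch, not a definitive implementation: merge_into and its empty parameter are hypothetical names (not part of pandas), and it assumes empty cells are encoded as ''.

```python
import pandas as pd

df = pd.DataFrame({1: ['a', 'b', 'c'] + [''] * 2,
                   2: [''] * 2 + ['d', 'e', 'f']})


def merge_into(df, target, other, empty=''):
    """Hypothetical helper: fill empty cells of `target` from `other`.

    Afterwards `other` keeps a value only on rows where both columns
    were non-empty (the conflicts); everywhere else it is set to `empty`.
    """
    out = df.copy()
    target_empty = df[target] == empty

    # Where the target cell is empty, take the value from the other column.
    out[target] = df[target].mask(target_empty, df[other])

    # A conflict is a row where both columns hold a value.
    conflict = ~target_empty & (df[other] != empty)

    # Keep the other column only on conflict rows; blank it elsewhere.
    out[other] = df[other].where(conflict, empty)
    return out


into_col1 = merge_into(df, target=1, other=2)  # merge column 2 into column 1
into_col2 = merge_into(df, target=2, other=1)  # merge column 1 into column 2
print(into_col1)
print(into_col2)
```

Because the function works on a copy, the original df is left unchanged, and swapping the target and other arguments switches the merge direction.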