python dataframe substring concatenation

Compare two dataframes using substrings in each cell, keep data in first dataframe if it is in corresponding spot in second dataframe

I have two dataframes. Both have the same index and same column labels. I want to compare them by substring and cell location.

data_1 = {'column A': ['AA (1)', 'nan', 'nan', 'DD (6)', 'nan'], 
          'column B': ['AA (8)', 'nan', 'CC (3)', 'nan', 'nan'], 
          'column C': ['AA (9)', 'nan', 'CC (7)', 'DD (5)', 'nan']}  
df_1 = pd.DataFrame(data_1)  
data_2 = {'column A': ['AA', 'BB', 'nan', 'nan', 'nan'], 
          'column B': ['nan', 'nan', 'CC', 'nan', 'EE'], 
          'column C': ['nan', 'BB', 'CC', 'DD', 'nan']}  
df_2 = pd.DataFrame(data_2)

The first dataframe contains the data I want to keep, given that the first two characters are the same and in the same location in the second dataframe. For example, compare 'column A' between both dataframes. Keep only 'AA (1)' in column one since the 'AA' substring is in both dataframes and in the same location, but drop 'DD (6)'.

Here is the desired output:

data_3 = {'column A': ['AA (1)', 'nan', 'nan', 'nan', 'nan'], 
          'column B': ['nan', 'nan', 'CC (3)', 'nan', 'nan'], 
          'column C': ['nan', 'nan', 'CC (7)', 'DD (5)', 'nan']}  
df_3 = pd.DataFrame(data_3)

Solution

def match(df_11, df_2):
    new_df = pd.DataFrame([])
    concat_df = pd.concat([df_1, df_2], axis=1)
    for i in df_1.columns:
        new_df[i] = concat_df[i].apply(
            lambda x: ['nan', x[0]][x[1].startswith(x[0][:2])], axis=1)
    return new_df


df_3 = match(df_1, df_2)
print(df_3)