python pandas data-science data-manipulation

Pandas map between two dataframes into column


Let's say I have a df1 like this (there are more columns, but only this one is relevant):

A
a1
a2
a3

and a df2 like:

A
a1
a3
a4
a7

Column A of df2 (the column name is the same in df1 and df2) contains some of the values in df1, but not all of them. I'd like to add a column "Found in df2?" to df1 indicating whether each value was found. Example:

df1
A  Found in df2?
a1       Y
a2       N
a3       Y

I've tried np.where and some merging magic but couldn't wrap my head around this.


Solution

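  • You can use isin, which returns a boolean Series marking which values of df['A'] also appear in df2['A'] (here df plays the role of df1; see Setup below):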

    df['found in df2'] = df['A'].isin(df2['A'].values)
    
    print(df)
    
        A   found in df2
    0   a1  True
    1   a2  False
    2   a3  True
    

    Setup

    import pandas as pd

    # df corresponds to df1 in the question
    df = pd.DataFrame({'A':['a1','a2','a3']})
    df2 = pd.DataFrame({'A':['a1','a3','a4','a7']})
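
    The question asks for Y/N labels rather than True/False, and mentions np.where and merging. A minimal sketch of both routes follows, assuming the frames are named df1 and df2 as in the question. The column name "Found in df2?" comes from the question; the helper names mask, merged and via_merge are illustrative only. This is one possible way to do it, not necessarily what the answer's author had in mind.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': ['a1', 'a2', 'a3']})
df2 = pd.DataFrame({'A': ['a1', 'a3', 'a4', 'a7']})

# Route 1: isin + np.where
# mask is True where the value in df1['A'] also occurs in df2['A']
mask = df1['A'].isin(df2['A'])
# Translate True/False into the 'Y'/'N' labels used in the question
df1['Found in df2?'] = np.where(mask, 'Y', 'N')
print(df1['Found in df2?'].tolist())  # ['Y', 'N', 'Y']

# Route 2: left merge with indicator=True
# drop_duplicates() keeps repeated keys in df2 from duplicating rows of df1
merged = df1[['A']].merge(
    df2[['A']].drop_duplicates(), on='A', how='left', indicator=True
)
# The '_merge' column is 'both' for matched rows and 'left_only' otherwise
via_merge = np.where(merged['_merge'] == 'both', 'Y', 'N')
```

    For a single key column, the isin route is usually the simpler of the two. The merge route may be worth considering when the match has to be made on several columns at once.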