Search code examples
pythonpandaslistdataframenp

check the existence of value on other Dataframe


I have two data frames F1 and F2 containing both the column id1, id2.

F1 contains two columns F1[id1,id2].
F2 contains three column [id1,id2,Description] I wante to test if F2['id1']exists in F1['id1'] OR F2['id2']exists inF1['id2'] then i must addd a colmun in F1 with Description of this id1 or id2 in F2` . The contens of F1 and F2 are are HERE. The Output That im attending on F1 is also HERE I created F1 and F2 like This

     F1 = {'id1': ['x22', 'x13','NaN','x421'],'id2':['NaN','223','788','NaN']}
     F1 = pd.DataFrame(data=F1)
     F2 = {'id1': ['x22', 'NaN','NaN','x413','x421'],'id2':['NaN','223','788','NaN','233'],'Description':['California','LA','NY','Havnover','Munich']}
     F2 = pd.DataFrame(data=F2)

Actually i tried several solutions . But nothing helps me to do it . Help please


Solution

  • Use:

    #if necessary replace string NaN to missing values
    F1 = F1.replace('NaN', np.nan)
    F2 = F2.replace('NaN', np.nan)
    

    For each column remove duplicates by DataFrame.drop_duplicates, missing valeus by DataFrame.dropna and create index by id column by DataFrame.set_index:

    s1 = F2.drop_duplicates('id1').dropna(subset=['id1']).set_index('id1')['Description']
    s2 = F2.drop_duplicates('id2').dropna(subset=['id2']).set_index('id2')['Description']
    

    Then use Series.map by columns from F1 with Series.combine_first:

    F1['Description'] = F1['id1'].map(s1).combine_first(F1['id2'].map(s2))
    print (F1)
        id1  id2 Description
    0   x22  NaN  California
    1   x13  223          LA
    2   NaN  788          NY
    3  x421  NaN      Munich