Search code examples
pythonpandasexceldataframepandas-styles

How to highlight rows or string in one dataframe based off the values of a column in a different datafame?


I want to be able take vales from one column of dataframe and find the exact same values in another dataframe to either highlight the whole row or just the string (whatever is easier/possible).

dataset1 = {'Gene Name':['ampC','fliC','ompC','fruB','yobH','gltP','ruvA','yacC','folD'],
       'FoldChange':['4.54','5.65','7.89','6.45','10.67','4.63','2.65','9.45','5.79'],
        'pvalue':['0','0','0','0','0','0','0','0','0']}

df1= pd.DataFrame(dataset1)

dataset2 = {'Gene Name':['gltP','ruvA','yacC','folD']}

df2 = pd.DataFrame(dataset2)

If it is possible I would like to make something that would look like either of the bottom examples.

What I would like my dataset to look like

I've tried a bunch of different stuff but I'm still learning python so not sure how to do this.


Solution

  • You can check if the genes of df1 isin those of df2, then apply your CSS style :

    m = df1["Gene Name"].isin(df2["Gene Name"])
    
    def fn(ser, how):
        import numpy as np
        return np.where(m, f"{how}: red", "")
        
    # choose one of the two
    left = df1.style.apply(fn, how="color", subset=["Gene Name"])
    right = df1.style.apply(fn, how="background-color")
    
    # .to_excel("output.xlsx", index=False) # to save an Excel
    

    Preview of both :