python python-3.x pandas numpy vectorization

Update values in dataframe based on dictionary and condition


I have a dataframe and a dictionary that maps some of the dataframe's columns to values. I want to update the dataframe from the dictionary so that each of those columns keeps the higher of its current value and the dictionary value.

>>> df1

    a   b   c   d   e   f
0   4   2   6   2   8   1
1   3   6   7   7   8   5
2   2   1   1   6   8   7
3   1   2   7   3   3   1
4   1   7   2   6   7   6
5   4   8   8   2   2   1

and the dictionary is

compare = {'a':4, 'c':7, 'e':3}

So I want to check the values in columns ['a','c','e'] and replace each one with the dictionary value for that column whenever the dictionary value is higher.

What I have tried is this:

comp = pd.DataFrame(pd.Series(compare).reindex(df1.columns).fillna(0)).T

df1[df1.columns] = df1.apply(lambda x: np.where(x>comp, x, comp)[0] ,axis=1)

Expected output:

>>> df1

    a   b   c   d   e   f
0   4   2   7   2   8   1
1   4   6   7   7   8   5
2   4   1   7   6   8   7
3   4   2   7   3   3   1
4   4   7   7   6   7   6
5   4   8   8   2   3   1

Solution

  • Another possible solution, based on numpy:

    cols = list(compare.keys())
    # Element-wise maximum; the dictionary values broadcast across the rows
    df1[cols] = np.maximum(df1[cols].values, np.array(list(compare.values())))
    

    Output:

       a  b  c  d  e  f
    0  4  2  7  2  8  1
    1  4  6  7  7  8  5
    2  4  1  7  6  8  7
    3  4  2  7  3  3  1
    4  4  7  7  6  7  6
    5  4  8  8  2  3  1
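
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same `np.maximum` approach using the sample data from the question. The helper name `raise_to_minimums` is hypothetical (it is not part of pandas or numpy), and returning a copy rather than modifying `df1` in place is an assumption made for the example.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "a": [4, 3, 2, 1, 1, 4],
    "b": [2, 6, 1, 2, 7, 8],
    "c": [6, 7, 1, 7, 2, 8],
    "d": [2, 7, 6, 3, 6, 2],
    "e": [8, 8, 8, 3, 7, 2],
    "f": [1, 5, 7, 1, 6, 1],
})
compare = {"a": 4, "c": 7, "e": 3}


def raise_to_minimums(frame, minimums):
    """Hypothetical helper: return a copy of `frame` in which every column
    named in `minimums` is raised to at least the given value."""
    result = frame.copy()
    cols = list(minimums)
    # Build the floor values in the same order as `cols`
    floors = np.array([minimums[c] for c in cols])
    # np.maximum broadcasts the 1-D `floors` across every row
    result[cols] = np.maximum(result[cols].to_numpy(), floors)
    return result


updated = raise_to_minimums(df1, compare)
print(updated)
```

Columns that do not appear in the dictionary (`b`, `d`, `f`) are never touched, so there is no need to reindex the dictionary against all columns or fill the gaps with 0 as in the attempt from the question.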