Search code examples
pythonpandasdataframenumpyanalytics

Get rows of a pandas dataframe where one the values of a combination of two columns are greater than or equal to those given in a list


I have the following dataframe:

df = pd.DataFrame({"a": [1,1,1,2,2,2,3,3,3,4,4,4], "b": [4,5,6,4,5,6,4,5,6,4,5,6]})

https://i.sstatic.net/SrG0e.png

I have been given the following list:

l = ["15", "24", "36", "45"]

I need to filter the rows of df based on the list. For example, I need all the rows where the value in column "a" is 1 and the values in columns "b" is greater than or equal to 5, the rows where the value in column "a" is 2 and the values in columns "b" is greater than or equal to 4, the rows where the value in column "a" is 3 and the values in columns "b" is greater than or equal to 6 and so on.

So, the output would look like, https://i.sstatic.net/RJ7HA.png

I can use iteration but I believe there must be better way in pandas. Any help is greatly appreciated.


Solution

  • You can use a mapping Series/dictionary with map and boolean indexing:

    s = pd.Series({int(x[0]): int(x[1]) for x in l})
    # s = {int(x[0]): int(x[1]) for x in l}
    
    out = df[df['b'].ge(df['a'].map(s))]
    

    Output:

        a  b
    1   1  5
    2   1  6
    3   2  4
    4   2  5
    5   2  6
    8   3  6
    10  4  5
    11  4  6