Tags: python, python-3.x, pandas, dataframe, apply

How to apply pandas.map() where the function takes more than 1 argument


Suppose I have a dataframe with a column of probabilities. I have written a function that returns 1 if a probability is at least a threshold value and 0 otherwise. The catch is that I want to pass the threshold as an argument to the function and then map that function over the pandas dataframe.

Take the code example below:

import pandas as pd

def partition(x, threshold):
    # 0 below the threshold, 1 at or above it
    if x < threshold:
        return 0
    else:
        return 1

df = pd.DataFrame({'probability': [0.2, 0.8, 0.4, 0.95]})
df2 = df.map(partition)  # no threshold is passed here

My question is, how would the last line work, i.e. how do I pass the threshold value inside my map function?


Solution

  • We can use DataFrame.applymap (renamed to DataFrame.map in pandas 2.1, where applymap is deprecated):

    df2 = df.applymap(lambda x: partition(x, threshold=0.5))
    

    Or if only one column:

    df['probability'] = df['probability'].apply(lambda x: partition(x, threshold=0.5))
    

    But neither is necessary here. A vectorized comparison gives the same result:

    threshold = 0.5
    df2 = df.ge(threshold).astype(int)
    

    I recommend this last form, since it avoids calling a Python function once per element.
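
For comparison, here is a minimal sketch of two other ways to pass the threshold without a lambda, checked against the vectorized form. It assumes a threshold of 0.5, as in the answer; the variable names `via_kwargs`, `via_partial` and `vectorized` are only illustrative. It relies on `Series.apply` forwarding extra keyword arguments to the function, and on `functools.partial` from the standard library.

```python
from functools import partial

import pandas as pd


def partition(x, threshold):
    """Return 0 if x is below threshold, otherwise 1."""
    return 0 if x < threshold else 1


df = pd.DataFrame({'probability': [0.2, 0.8, 0.4, 0.95]})
threshold = 0.5  # illustrative value, same as in the answer

# Series.apply forwards extra keyword arguments to the function.
via_kwargs = df['probability'].apply(partition, threshold=threshold)

# functools.partial fixes the threshold first, leaving a one-argument function.
via_partial = df['probability'].apply(partial(partition, threshold=threshold))

# Vectorized comparison: no Python-level function call per element.
vectorized = df['probability'].ge(threshold).astype(int)

print(via_kwargs.tolist())   # [0, 1, 0, 1]
print(via_partial.tolist())  # [0, 1, 0, 1]
print(vectorized.tolist())   # [0, 1, 0, 1]
```

All three produce the same labels here. The `partial` form is handy when the mapping method accepts only a one-argument callable, while the vectorized form is usually the better choice for a simple comparison like this one.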