
Generate a conditional column in pandas


I have:

df = pd.DataFrame({'price':['45p','£1.35']})

I want to convert these to numeric, and get:

pd.DataFrame({'price':['45p','£1.35'],'numeric':[0.45,1.35]})

I tried:

df['numeric']=np.where(df.price.str.contains('p') is True,
                  pd.to_numeric(df.price.str.replace('p',''))/100,
                 pd.to_numeric(df.price.str.replace('£','')))

and got the following error: ValueError: Unable to parse string "£1.35" at position 7

Any suggestions as to what I am doing wrong?


Solution

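  • Both branches passed to np.where are evaluated on the whole column before any selection happens, so pd.to_numeric still sees '£1.35' in the pennies branch and raises. Separately, df.price.str.contains('p') is True is an identity check on the Series object, so it is always False; drop the "is True".

    Try one step at a time: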

    # where values in pennies
    is_pennies = df['price'].str.contains('p')
    
    # remove the currency characters and convert to numerics
    # (regex=True is needed in pandas 2.0+, where the pattern is literal by default)
    df['price'] = df['price'].str.replace('p|£', '', regex=True).astype(float)
    
    # update the values in pennies
    df.loc[is_pennies, 'price'] /= 100
    

    Output:

       price
    0   0.45
    1   1.35
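
The steps above overwrite price, whereas the question asked for a separate numeric column next to the original strings. A minimal sketch of that variant follows. It assumes every value is either digits followed by p or £ followed by digits. The names is_pennies and amount are illustrative, and str.endswith is swapped in for str.contains only to make the mask a little stricter.

```python
import pandas as pd

df = pd.DataFrame({'price': ['45p', '£1.35']})

# True where the value is quoted in pennies (assumes a trailing 'p')
is_pennies = df['price'].str.endswith('p')

# strip both currency markers from a copy, then parse what is left
amount = pd.to_numeric(df['price'].str.replace('[p£]', '', regex=True))

# keep pound amounts as they are; use amount / 100 on the penny rows
df['numeric'] = amount.where(~is_pennies, amount / 100)

print(df)
```

Because the currency markers are stripped before pd.to_numeric is called, no unparsed string ever reaches it, which is what went wrong in the np.where attempt. The price column keeps the original strings, and numeric holds 0.45 and 1.35.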