Search code examples
pythonpandasdata-sciencedata-cleaningdata-profiling

How to detect and convert units of column values without using python loop?


As per my knowledge Python loops are slow, hence it is preferred to use pandas inbuilt functions.

In my problem, one column will have different currencies, I need to convert them to dollar. How can I detect and convert them to dollar using pandas inbuilt functions ?

My column as following:

100Dollar
200Dollar
100Euro
300Euro
184pounds
150pounds
10rupee
30rupee

Note: amount and currency name is in same column.

Note: conversion exchange rate w.r.t dollar {Euro: 1.2, pounds: 1.3, rupee: 0.05}

Note: currency enum is ['Euro', 'Dollar', 'Pounds', 'Rupee']


Solution

  • Use Series.str.extract with regular expressions to extra the correct values into a new column. Then map the exchange_rate to the Currency column to calculate the Amount dollars:

    df[['Amount', 'Currency']] = df['column'].str.extract(r'(\d+)(\D+)')
    
    exchange_rate = {'Euro': 1.2, 'pounds': 1.3, 'rupee': 0.05}
    df['Amount_dollar'] = pd.to_numeric(df['Amount']) * df['Currency'].map(exchange_rate).fillna(1) 
    
          column  Amount Currency  Amount_dollar
    0  100Dollar     100   Dollar         100.00
    1  200Dollar     200   Dollar         200.00
    2    100Euro     100     Euro         120.00
    3    300Euro     300     Euro         360.00
    4  184pounds     184   pounds         239.20
    5  150pounds     150   pounds         195.00
    6    10rupee      10    rupee           0.50
    7    30rupee      30    rupee           1.50