Search code examples
pythonpython-3.xpandastimeduration

Converting years and months in python


I am working with time column like below:

import pandas as pd
 
# initialise data of lists.
data = {'Name':['Tom', 'nick', 'krish', 'jack'], 
        'Time':['3 years 4 months in role 20 years in company', '1 year 7 months in role 2 years 2 months in company',
                '2 years 1 month in role 7 years 2 months in company', '3 months in role 8 years 2 months in company']}
 
# Create DataFrame
df = pd.DataFrame(data)
 
# Print the output.
df

I need to split the time column into two columns time_in_role(months) and time_in_company(months). And calculate the appropriate time in months.

Sample input:

enter image description here

Output expected:

enter image description here

Any help is appreciated.


Solution

  • You can use regular expressions to get years/months from the string. Afterwards it's just simple math:

    role = (
        df["Time"]
        .str.extract(r"(?:(?P<y>\d+) years?)?\s*(?:(?P<m>\d+) months?)? in role")
        .astype(float)
        .fillna(0)
    )
    
    company = (
        df["Time"]
        .str.extract(r"(?:(?P<y>\d+) years?)?\s*(?:(?P<m>\d+) months?)? in company")
        .astype(float)
        .fillna(0)
    )
    
    df["Time_in_Role(months)"] = (role["y"] * 12 + role["m"]).astype(int)
    df["Time_in_Company(months)"] = (company["y"] * 12 + company["m"]).astype(int)
    print(df)
    

    Prints:

        Name                                                 Time  Time_in_Role(months)  Time_in_Company(months)
    0    Tom         3 years 4 months in role 20 years in company                    40                      240
    1   nick  1 year 7 months in role 2 years 2 months in company                    19                       26
    2  krish  2 years 1 month in role 7 years 2 months in company                    25                       86
    3   jack         3 months in role 8 years 2 months in company                     3                       98