How to change this time data into H:M in Python


I have a dataset with a duration column whose time data is stored as strings (dtype object), as shown below:

df['duration'].head(10)
    0          60 min.
    1    1 hr. 13 min.
    2    1 hr. 10 min.
    3          52 min.
    4    1 hr. 25 min.
    5          45 min.
    6          45 min.
    7          60 min.
    8          45 min.
    9          45 min.
    Name: duration, dtype: object

How do I change this to an appropriate numerical value, like below?

0    00:60
1    01:13
2    01:10
3    00:52
4    01:25
5    00:45

Solution

  • Here is one way to get both a string version in HH:MM form and a timedelta version:

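    import pandas as pd
    df = pd.DataFrame({'duration': ['60 min.', '1 hr. 13 min.', '1 hr. 10 min.']})
    print(df)

    # Pull out the digit groups, e.g. ['60'] or ['1', '13'] (raw string avoids an invalid-escape warning)
    df['parts'] = df.duration.str.findall(r'\d+')
    # One number means minutes only; otherwise the first is hours and the last is minutes.
    # Note: this assumes every entry has a minutes part; a bare '2 hr.' would be read as 2 minutes.
    df['timedelta'] = df.parts.apply(lambda x: pd.to_timedelta((0 if len(x) < 2 else int(x[0])) * 3600 + int(x[-1]) * 60, unit='s'))
    df['hours and minutes'] = df.parts.apply(lambda x: f"{0 if len(x) < 2 else int(x[0]):02}:{int(x[-1]):02}")
    df = df.drop(columns=['duration', 'parts'])
    print(df)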
    

    Input:

            duration
    0        60 min.
    1  1 hr. 13 min.
    2  1 hr. 10 min.
    

    Output:

            timedelta hours and minutes
    0 0 days 01:00:00             00:60
    1 0 days 01:13:00             01:13
    2 0 days 01:10:00             01:10
    

    If we do this:

    print(df.timedelta.dtypes)
    

    ... we see that the timedelta column indeed contains numerical values (of timedelta data type):

    timedelta64[ns]
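
Since the question asks for a numerical value, a plain count of minutes may also be useful. The following is a minimal sketch, not part of the original answer. It assumes every entry looks like `N min.` or `H hr. N min.`; the column names `minutes` and `hh_mm`, the extra sample row, and the regular expression are illustrative choices.

```python
import pandas as pd

df = pd.DataFrame({'duration': ['60 min.', '1 hr. 13 min.', '1 hr. 10 min.', '52 min.']})

# Capture the optional hours part and the optional minutes part as named groups.
pattern = r'(?:(?P<hr>\d+)\s*hr\.)?\s*(?:(?P<min>\d+)\s*min\.)?'
parts = df['duration'].str.extract(pattern)

# Unmatched groups come back as NaN; convert to numbers and treat missing parts as 0.
parts = parts.apply(pd.to_numeric).fillna(0).astype(int)

# Total duration in minutes as an integer column.
df['minutes'] = parts['hr'] * 60 + parts['min']

# A timedelta built from the minute count.
df['timedelta'] = pd.to_timedelta(df['minutes'], unit='min')

# A normalized HH:MM string in which 60 minutes rolls over into the hour.
hours = (df['minutes'] // 60).astype(str).str.zfill(2)
mins = (df['minutes'] % 60).astype(str).str.zfill(2)
df['hh_mm'] = hours + ':' + mins

print(df)
```

Unlike the string column in the answer above, `hh_mm` carries 60 minutes over into the hour, so `60 min.` becomes `01:00` rather than `00:60`. The integer `minutes` column can be summed or averaged directly.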