python pandas string timedelta dtype

Python/Pandas Add string to rows in a column that contain a character a specific number of times


I have a pandas DataFrame (data) with a 'Duration' column of dtype 'object' that holds time durations in the format '%H:%M:%S', such as '1:47:54' (7 characters), but also durations with only minutes and seconds, '%M:%S', such as '45:20' or '03:20' (5 characters).

I want to convert this column to a pandas Timedelta dtype.

However, it seems that every row in the column needs to be in the same format, because when I try:

data['Duration'] = pd.to_timedelta(data['Duration'])

I get: "ValueError: expected hh:mm:ss format"

Do you have any idea how to solve this problem?

Is it possible to search for the rows in a column that contain ':' only once?

And then prepend the string '0:' to only those rows?

Maybe with: data['Duration'] = '0:' + data['Duration'].astype(str)


Solution

  • You can use Series.where to keep values that are already 7 or more characters long and prepend '0:' to the shorter ones:

    df['Timedelta'] = pd.to_timedelta(df['Duration'].where(df['Duration'].str.len() >= 7,
                                                           other='0:' + df['Duration']))
    print(df)
    
    # Output
      Duration       Timedelta
    0  1:47:54 0 days 01:47:54
    1    45:20 0 days 00:45:20
    2    03:20 0 days 00:03:20
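
The question also asks whether rows containing ':' exactly once can be selected directly. A minimal sketch of that variant, assuming the same three sample values as above (the DataFrame and the names only_minutes and padded are made up for illustration): Series.str.count counts the colons and Series.mask replaces only the matching rows.

```python
import pandas as pd

# Hypothetical sample data mirroring the values shown in the question.
df = pd.DataFrame({'Duration': ['1:47:54', '45:20', '03:20']})

# True where the string contains exactly one ':' (minutes and seconds only).
only_minutes = df['Duration'].str.count(':') == 1

# Prepend '0:' on those rows only; rows that already have hours stay unchanged.
padded = df['Duration'].mask(only_minutes, '0:' + df['Duration'])

df['Timedelta'] = pd.to_timedelta(padded)
print(df)
```

Compared with the length check, counting colons does not depend on how many digits each part has, so a value such as '5:20' would also be treated as minutes and seconds.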