Search code examples
pythonpandasdatetimepython-datetimetimedelta

Calculate Time Difference Between Two Pandas Columns in Hours and Minutes


I have two columns, fromdate and todate, in a dataframe.

import pandas as pd

data = {'todate': [pd.Timestamp('2014-01-24 13:03:12.050000'), pd.Timestamp('2014-01-27 11:57:18.240000'), pd.Timestamp('2014-01-23 10:07:47.660000')],
        'fromdate': [pd.Timestamp('2014-01-26 23:41:21.870000'), pd.Timestamp('2014-01-27 15:38:22.540000'), pd.Timestamp('2014-01-23 18:50:41.420000')]}

df = pd.DataFrame(data)

I add a new column, diff, to find the difference between the two dates using

df['diff'] = df['fromdate'] - df['todate']

I get the diff column, but it contains days, when there's more than 24 hours.

                   todate                 fromdate                    diff
0 2014-01-24 13:03:12.050  2014-01-26 23:41:21.870  2 days 10:38:09.820000
1 2014-01-27 11:57:18.240  2014-01-27 15:38:22.540  0 days 03:41:04.300000
2 2014-01-23 10:07:47.660  2014-01-23 18:50:41.420  0 days 08:42:53.760000

How do I convert my results to only hours and minutes (i.e. days are converted to hours)?


Solution

  • Pandas timestamp differences returns a datetime.timedelta object. This can easily be converted into hours by using the *as_type* method, like so

    import pandas
    df = pandas.DataFrame(columns=['to','fr','ans'])
    df.to = [pandas.Timestamp('2014-01-24 13:03:12.050000'), pandas.Timestamp('2014-01-27 11:57:18.240000'), pandas.Timestamp('2014-01-23 10:07:47.660000')]
    df.fr = [pandas.Timestamp('2014-01-26 23:41:21.870000'), pandas.Timestamp('2014-01-27 15:38:22.540000'), pandas.Timestamp('2014-01-23 18:50:41.420000')]
    (df.fr-df.to).astype('timedelta64[h]')
    

    to yield,

    0    58
    1     3
    2     8
    dtype: float64