Search code examples
pythonpandasdatediff

pandas: function equivalent to SQL's datediff()?


Is there an equivalent to SQL's datediff function in Python's pandas? The answer to this question: Add column with number of days between dates in DataFrame pandas explains how to calculate the difference in days. For example:

>>> (pd.to_datetime('15-10-2010') - pd.to_datetime('15-07-2010')) / pd.offsets.Day(1)
92.0

However, I have two questions:

  1. Is there a way to calculate the difference in months? I can approximate dividing the result above by 30, by 31, but I was wondering if there is some built-in function that does this automatically.
  2. what is the syntax of pd.offsets? I tried dividing by pd.offsets.Month(1) and it doesn't work. I looked up the documentation here (which is atrocious, like all of Python's documentation!): http://pandas.pydata.org/pandas-docs/stable/timeseries.html#dateoffset-objects but Day() is not there, so I'm confused

Solution

  • UPDATE:

    def months_between(d1, d2):
        dd1 = min(d1, d2)
        dd2 = max(d1, d2)
        return (dd2.year - dd1.year)*12 + dd2.month - dd1.month
    
    In [125]: months_between(pd.to_datetime('2015-01-02 12:13:14'), pd.to_datetime('2012-03-02 12:13:14'))
    Out[125]: 34
    

    OLD answer:

    In [40]: (pd.to_datetime('15-10-2010') - pd.to_datetime('15-07-2010')).days
    Out[40]: 92
    

    you can also do this for months:

    In [48]: pd.to_datetime('15-10-2010').month - pd.to_datetime('15-07-2010').month
    Out[48]: 3