Tags: python, date, datetime, python-dateutil, rrule

dateutil's rrule returns dates that are 2 months apart


I am new to Python and to the dateutil module. I am running the following code:

disclosure_start_date = resultsDict['fd_disclosure_start_date']
disclosure_end_date = datetime.datetime.now()
disclosure_dates = [dt for dt in rrule(MONTHLY, dtstart=disclosure_start_date, until=disclosure_end_date)]

Here disclosure_start_date is 2012-10-31 00:00:00, which as a datetime is datetime.datetime(2012, 10, 31, 0, 0).

The end date is the current time.

With this code I get dates for every other month, i.e. 2 months apart, instead of one per month. The result is:

>>> list(disclosure_dates)
[datetime.datetime(2012, 10, 31, 0, 0), 
 datetime.datetime(2012, 12, 31, 0, 0), 
 datetime.datetime(2013, 1, 31, 0, 0), 
 datetime.datetime(2013, 3, 31, 0, 0), 
 datetime.datetime(2013, 5, 31, 0, 0), 
 datetime.datetime(2013, 7, 31, 0, 0), 
 datetime.datetime(2013, 8, 31, 0, 0), 
 datetime.datetime(2013, 10, 31, 0, 0), 
 datetime.datetime(2013, 12, 31, 0, 0), 
 datetime.datetime(2014, 1, 31, 0, 0), 
 datetime.datetime(2014, 3, 31, 0, 0), 
 datetime.datetime(2014, 5, 31, 0, 0), 
 datetime.datetime(2014, 7, 31, 0, 0), 
 datetime.datetime(2014, 8, 31, 0, 0), 
 datetime.datetime(2014, 10, 31, 0, 0), 
 datetime.datetime(2014, 12, 31, 0, 0), 
 datetime.datetime(2015, 1, 31, 0, 0), 
 datetime.datetime(2015, 3, 31, 0, 0), 
 datetime.datetime(2015, 5, 31, 0, 0), 
 datetime.datetime(2015, 7, 31, 0, 0), 
 datetime.datetime(2015, 8, 31, 0, 0), 
 datetime.datetime(2015, 10, 31, 0, 0), 
 datetime.datetime(2015, 12, 31, 0, 0), 
 datetime.datetime(2016, 1, 31, 0, 0), 
 datetime.datetime(2016, 3, 31, 0, 0), 
 datetime.datetime(2016, 5, 31, 0, 0)]

I am not sure what I am doing wrong. Can someone please point out the mistake here?


Solution

  • The issue you are running into comes from the fact that datetime.datetime(2012, 10, 31, 0, 0) is the 31st of the month, and not all months have a 31st. The rrule module is an implementation of the iCalendar RFC (RFC 2445, since superseded by RFC 5545). Per section 3.3.10 of RFC 5545:

    Recurrence rules may generate recurrence instances with an invalid date (e.g., February 30) or nonexistent local time (e.g., 1:30 AM on a day where the local time is moved forward by an hour at 1:00 AM). Such recurrence instances MUST be ignored and MUST NOT be counted as part of the recurrence set.

    Since you have a monthly rule that generates the 31st of a month, it will skip all months with 30 or fewer days. This behavior is also discussed in a dateutil bug report.
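
As a quick check of that explanation, here is a minimal standard-library sketch that lists which months of a year contain a given day of the month. The helper name months_with_day is made up for this illustration; it is not part of dateutil.

```python
import calendar

def months_with_day(year, day):
    """Return the month numbers in `year` that contain the given day of month.

    Hypothetical helper for illustration only, not a dateutil function.
    """
    # calendar.monthrange(year, month)[1] is the number of days in that month
    return [m for m in range(1, 13) if calendar.monthrange(year, m)[1] >= day]

print(months_with_day(2013, 31))  # [1, 3, 5, 7, 8, 10, 12]
print(months_with_day(2013, 30))  # every month except February
```

The months printed for day 31 are the same months that appear for 2013 in the output shown in the question.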

    If you just want the last day of the month, you should use the bymonthday=-1 argument:

    from dateutil.rrule import rrule, MONTHLY
    from datetime import datetime
    
    disclosure_start_date = datetime(2012, 10, 31, 0, 0)
    
    rr = rrule(freq=MONTHLY, dtstart=disclosure_start_date, bymonthday=-1)
    # >>> rr.between(datetime(2013, 1, 1), datetime(2013, 5, 1))
    # [datetime.datetime(2013, 1, 31, 0, 0),
    #  datetime.datetime(2013, 2, 28, 0, 0),
    #  datetime.datetime(2013, 3, 31, 0, 0),
    #  datetime.datetime(2013, 4, 30, 0, 0)]
    

    Unfortunately, I don't think there's an RFC-compliant way to write a simple RRULE that falls back to the end of the month if and only if it's necessary. For example, with January 30th you need a fallback for February, but you don't want to use bymonthday=-2, because that will give you February 27th.
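
That said, one combination that may cover the January 30th case, and which is not part of the original answer, is to list several candidate days in bymonthday and keep only the last valid one per month with bysetpos=-1. Both are standard rrule parameters (BYMONTHDAY and BYSETPOS in the RFC). Treat this as a sketch to verify against your own dates and dateutil version rather than a confirmed recommendation.

```python
from datetime import datetime
from dateutil.rrule import rrule, MONTHLY

# Assumption for illustration: we want "the 30th, or the last day of the
# month when the month has no 30th".
dtstart = datetime(2013, 1, 30)

# bymonthday lists the candidate days; days that do not exist in a given
# month are ignored. bysetpos=-1 keeps only the last remaining candidate.
rr = rrule(MONTHLY, dtstart=dtstart, bymonthday=(28, 29, 30), bysetpos=-1, count=4)
dates = list(rr)

for d in dates:
    print(d.date())
# 2013-01-30
# 2013-02-28
# 2013-03-30
# 2013-04-30
```

For a start date on the 31st the same idea would use bymonthday=(28, 29, 30, 31), which gives the same dates as bymonthday=-1.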

    Alternatively, for a simple monthly rule like this, a better option is probably to just use relativedelta, which does fall back to the end of the month:

    from dateutil.relativedelta import relativedelta
    from datetime import datetime
    
    def disclosure_dates(dtstart, rd, dtend=None):
        ii = 0
        while True:
            # Always offset from dtstart rather than from the previous date
            cdate = dtstart + ii*rd
            ii += 1
    
            # Stop before yielding a date past dtend (dtend itself is included)
            if dtend is not None and cdate > dtend:
                break
            yield cdate
    
    
    dtstart = datetime(2013, 1, 31, 0, 0)
    rd = relativedelta(months=1)
    rr = disclosure_dates(dtstart, rd, dtend=datetime(2013, 5, 1))
    
    # >>> list(rr)
    # [datetime.datetime(2013, 1, 31, 0, 0),
    #  datetime.datetime(2013, 2, 28, 0, 0),
    #  datetime.datetime(2013, 3, 31, 0, 0),
    #  datetime.datetime(2013, 4, 30, 0, 0)]
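
To make the end-of-month fallback concrete, here is a rough standard-library approximation of what adding relativedelta(months=n) does to the day of the month. The function add_months is a hypothetical helper written for this illustration; it is not how dateutil implements relativedelta, only a sketch of the clamping behavior.

```python
import calendar
from datetime import datetime

def add_months(dt, months):
    """Shift dt by `months`, clamping the day to the target month's last day.

    Hypothetical helper for illustration; not part of dateutil.
    """
    # Count months from year 0 so that year rollover falls out of divmod
    month_index = dt.year * 12 + (dt.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))

start = datetime(2013, 1, 31)
# Each date is computed from `start`, never from the previous result
dates = [add_months(start, i) for i in range(4)]
for d in dates:
    print(d.date())
# 2013-01-31
# 2013-02-28
# 2013-03-31
# 2013-04-30
```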
    

    Note that I specifically used cdate = dtstart + ii * rd. You do not want to just keep a "running tally" by repeatedly adding rd to the previous date, as that will pin the day to the shortest month the tally has seen:

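    # rd is relativedelta(months=1), as defined above
    dt_base = datetime(2013, 1, 31)
    dt = dt_base
    for ii in range(5):
        cdt = dt_base + ii*rd            # offset from the fixed base date
        print('{} | {}'.format(dt, cdt)) # left: running tally, right: base + ii*rd
        dt += rd                         # running tally keeps the clamped day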
    

    Result:

    2013-01-31 00:00:00 | 2013-01-31 00:00:00
    2013-02-28 00:00:00 | 2013-02-28 00:00:00
    2013-03-28 00:00:00 | 2013-03-31 00:00:00
    2013-04-28 00:00:00 | 2013-04-30 00:00:00
    2013-05-28 00:00:00 | 2013-05-31 00:00:00