python, python-itertools

Is 'dropwhile' static? Can I make it dynamic?


Introduction

I came up with a cunning solution to my problem but, not-so-cunning, it doesn't work :-/

After hours of stepping through the debugger, I think (perhaps you can verify this) that it fails because itertools.dropwhile is fixed after the initial declaration, whereas I was hoping I could alter the input parameters to the predicate on every loop.

The task below attempts to pick a startdate, then an enddate that follows it, then a startdate that follows that, and an enddate after the last... and so on, so that we end up with a series of date intervals which don't overlap. The startdates come from one list, and the enddates from another.

The following solution loops over the startdates, then over the enddates, using dropwhile to skip the dates which are in 'the past'. It works perfectly the first time through, but on the second pass the enddate gets stuck at '2009-12-14'. I broke out the isbefore routine just so I could see when it gets tested and when it doesn't. I'm not sure, but I think the whole dropwhile test gets set in stone on the first pass and doesn't recompile itself on each pass, as I'd hoped and expected.

To finish it off, I was hoping to wrap it all inside a while True and exit via a StopIteration exception, thus extracting the full sequence of intervals. But it never fires. When I tried, neither iterator would actually 'next' all the way to its end.

Questions

  1. Is that (the set-in-stone conclusion) correct as to what's happening?
  2. Is there a short, clean, elegant way to make it behave the way I'd hoped? Do I have to write my own dropwhile which acts the way I want?

Code

import itertools
import datetime

startdates = [
    datetime.date(2009, 11, 5), datetime.date(2009, 11, 13),
    datetime.date(2009, 12, 4), datetime.date(2009, 12, 7),
    datetime.date(2009, 12, 29), datetime.date(2009, 12, 30)]

enddates = [
    datetime.date(2009, 10, 1), datetime.date(2009, 10, 2),
    datetime.date(2009, 11, 4), datetime.date(2009, 12, 14),
    datetime.date(2009, 12, 15), datetime.date(2009, 12, 30)]

enddate = datetime.date(1900, 1, 1)
startdate = datetime.date(1900, 1, 1)

def isbefore(a, b):
    return a <= b

for startdate in itertools.dropwhile(lambda date: isbefore(date, enddate), startdates):
    for enddate in itertools.dropwhile(lambda date: isbefore(date, startdate), enddates):
        print startdate, enddate
        break

Current Output

2009-11-05 2009-12-14
2009-11-13 2009-12-14
2009-12-04 2009-12-14
2009-12-07 2009-12-14

Desired Output

2009-11-05 2009-12-14
2009-12-29 2009-12-30

The more observant types will notice I asked a question yesterday that presents the same problem, but that one calls for a general solution, whereas this time I'm asking specifically about the workings of dropwhile.


Solution

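  • dropwhile only tests the predicate until it first returns false. From then on it stops calling the predicate and yields the rest of the iterable unmodified, so the predicate is not frozen; it is simply never called again, and later changes to enddate have no effect on the outer loop. Using itertools.ifilterfalse in its place gives your desired output, because it applies the predicate to every item. (In Python 3 the function is named itertools.filterfalse.)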
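
To make the difference concrete, here is a minimal sketch, written for Python 3 (so it uses filterfalse rather than the Python 2 name ifilterfalse). The names is_small and pair_intervals are hypothetical helpers introduced only for this illustration; the date lists are copied from the question. The first part records which items each predicate is actually called on, and the second part is one possible way to rewrite the loop, not the only one.

```python
import datetime
from itertools import dropwhile, filterfalse

# Part 1: see which items the predicate is called on.
calls = []

def is_small(x):
    # Hypothetical predicate that records every value it is asked about.
    calls.append(x)
    return x < 3

dropped = list(dropwhile(is_small, [1, 2, 5, 1, 2]))
dropwhile_calls = list(calls)   # predicate stops being called after 5

calls.clear()
filtered = list(filterfalse(is_small, [1, 2, 5, 1, 2]))
filterfalse_calls = list(calls)  # predicate is called on every item

# Part 2: the loop from the question, using filterfalse.
startdates = [
    datetime.date(2009, 11, 5), datetime.date(2009, 11, 13),
    datetime.date(2009, 12, 4), datetime.date(2009, 12, 7),
    datetime.date(2009, 12, 29), datetime.date(2009, 12, 30)]

enddates = [
    datetime.date(2009, 10, 1), datetime.date(2009, 10, 2),
    datetime.date(2009, 11, 4), datetime.date(2009, 12, 14),
    datetime.date(2009, 12, 15), datetime.date(2009, 12, 30)]

def pair_intervals(startdates, enddates):
    """Hypothetical helper: pair each usable start date with the next end date."""
    intervals = []
    # Sentinel earlier than any real date, as in the question.
    enddate = datetime.date(1900, 1, 1)
    # The lambda looks up enddate each time it is called, and filterfalse
    # calls it for every item, so the filter follows the latest end date.
    for startdate in filterfalse(lambda d: d <= enddate, startdates):
        # A fresh iterator is built on each pass, so it sees the current startdate.
        later_ends = filterfalse(lambda d: d <= startdate, enddates)
        next_end = next(later_ends, None)
        if next_end is None:
            break  # no end date follows this start date
        enddate = next_end
        intervals.append((startdate, enddate))
    return intervals

for start, end in pair_intervals(startdates, enddates):
    print(start, end)
```

With this sketch no while True wrapper is needed: the for loop ends on its own once the filtered start dates run out, and next(..., None) avoids having to catch StopIteration by hand.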