
Take a few elements from an iterable, do something, take a few elements more, and so on

Here is some Python code that moves data from a database on one server to a database on another server:

cursor1.execute("""
      SELECT d1.Doc_Id , d2.Doc_Id 
      FROM Document d1 
      INNER JOIN Reference r  ON d1.Doc_Id = r.Doc_Id 
      INNER JOIN Document d2 ON r.R9 = d2.T9
""")

cursor2.execute("START TRANSACTION")
cursor2.executemany( "INSERT IGNORE INTO citation_t(citing_doc_id, cited_doc_id) VALUES (?,?)",
    cursor1 )

Now, for the sake of exposition, let's say that the transaction runs out of space on the target hard drive before the commit, so the commit never happens and the inserted rows are lost. But I'm using the transaction for performance reasons, not for atomicity. So I would like to fill the hard drive with committed data, so that it stays full and I can show it to my boss. Again, this is only for the sake of exposition; the real question is below. In that scenario, I would rather do:

cursor1.execute("""
      SELECT d1.Doc_Id , d2.Doc_Id 
      FROM Document d1 
      INNER JOIN Reference r  ON d1.Doc_Id = r.Doc_Id 
      INNER JOIN Document d2 ON r.R9 = d2.T9
""")


dark_spawn = some_dark_magic_with_iterable( cursor1, MAX_ELEMENTS_TO_MOVE_TOGETHER )

for partial_iterable in dark_spawn:
    cursor2.execute("START TRANSACTION")
    cursor2.executemany( "INSERT IGNORE INTO citation_t(citing_doc_id, cited_doc_id) VALUES (?,?)",
        partial_iterable )


My question is: what is the right way to implement some_dark_magic_with_iterable, that is, to create an iterator that hands out the elements in batches, with a pause between batches?


  • Just create a generator! :P

    def some_dark_magic_with_iterable(curs, nelems):
        res = curs.fetchmany(nelems)
        while res:
            yield res
            res = curs.fetchmany(nelems)

    Ok, ok... for generic iterators...

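    def some_dark_magic_with_iterable(iterable, nelems):
        it = iter(iterable)  # accept any iterable, not only iterators
        try:
            while True:
                res = []
                while len(res) < nelems:
                    res.append(next(it))  # raises StopIteration when exhausted
                yield res
        except StopIteration:
            # Yield the final, possibly shorter, batch
            if res:
                yield res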
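
For what it's worth, the generic version above can also be written with itertools.islice. The following is only a sketch, not the answerer's code: the name chunked, the in-memory sqlite3 databases and the pairs table are assumptions made up for the demonstration, and plain INSERT with an explicit commit per batch stands in for the MySQL statements from the question.

```python
import sqlite3
from itertools import islice


def chunked(iterable, nelems):
    """Yield successive lists of at most nelems items from iterable."""
    it = iter(iterable)  # one shared iterator, so each slice resumes where the last one stopped
    while True:
        chunk = list(islice(it, nelems))
        if not chunk:  # source exhausted
            return
        yield chunk


# Hypothetical stand-ins for the two servers: two in-memory SQLite databases.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
src.execute("CREATE TABLE pairs(a INTEGER, b INTEGER)")
src.executemany("INSERT INTO pairs VALUES (?, ?)", [(i, i * 10) for i in range(7)])
dst.execute("CREATE TABLE citation_t(citing_doc_id INTEGER, cited_doc_id INTEGER)")

cursor1 = src.execute("SELECT a, b FROM pairs ORDER BY a")

batch_sizes = []
for partial_iterable in chunked(cursor1, 3):
    dst.executemany("INSERT INTO citation_t VALUES (?, ?)", partial_iterable)
    dst.commit()  # commit after every batch, so earlier batches survive a later failure
    batch_sizes.append(len(partial_iterable))
```

Each batch is a plain list, so it can be passed straight to executemany. The last batch may be shorter than nelems.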