
Using concurrent.futures with a class


I have a main that looks like the following:

import gather_filings_per_filer
import os
from concurrent import futures

def put_last_i(arg):
    with open(os.path.dirname(os.path.abspath(__file__)) + '/last_i.txt','w') as f:
        f.write(arg)

def get_last_i():
    with open(os.path.dirname(os.path.abspath(__file__)) + '/last_i.txt','r') as f:
        data = f.readlines()
        return data[0]

if __name__=='__main__':
    i = int(get_last_i())
    g_f_p_f = gather_filings_per_filer.gather_filings_per()
    jobs=[]


    with futures.ThreadPoolExecutor(max_workers = 3) as executor:
        my_d = dict((executor.submit(g_f_p_f.mt_main),j) for j in range(i,297085))

        for future in futures.as_completed(my_d):
            print future
            print future.exception()

And g_f_p_f.mt_main looks like the following:

class gather_filings_per:
    def mt_main(self, i):
        print "Started:",i
        self.w_m = wrap_mysql.wrap_mysql()
        flag = True
        #do stuff...

Gives me the following result:

<Future at 0x5dc2710 state=finished raised TypeError>
mt_main() takes exactly 2 arguments (1 given)
<Future at 0x5dc25b0 state=finished raised TypeError>
mt_main() takes exactly 2 arguments (1 given)

From my perspective mt_main takes only 1 argument (self should not be required given the typical self behavior).

What seems to be going wrong here that I am missing?


Solution

  • You're correct that mt_main needs only one argument beyond the implicit self. But the submit call passes none, so the call is one argument short. The error message counts self, which is why it reports 2 expected and 1 given. Splitting up the submit call makes this visually clear:

    my_d = dict(  # Making a dict
                # With key as the Future returned by submitting `mt_main` with no additional arguments
                (executor.submit(g_f_p_f.mt_main),
                # And value j
                 j)
                for j in range(i,297085))
    

    Perhaps you meant to pass j as an argument? Assuming it should also be the value, that would be:

    # Note the second argument to submit, which becomes the only non-self argument to mt_main
    my_d = dict((executor.submit(g_f_p_f.mt_main, j),j) for j in range(i,297085))
    

    Or, to simplify a bit, use a dict comprehension (available in Python 2.7 and later), which also separates the submit call from its value with : for better visual parsing:

    my_d = {executor.submit(g_f_p_f.mt_main, j): j for j in range(i, 297085)}
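
To see both the failure and the fix side by side, here is a minimal, self-contained sketch. It is written in Python 3 syntax. The `Worker` class and its doubling logic are hypothetical stand-ins for `gather_filings_per`, not the asker's real code, and the range is shortened to 5 jobs for illustration.

```python
from concurrent import futures


class Worker:
    """Hypothetical stand-in for gather_filings_per."""

    def mt_main(self, i):
        # Placeholder work; the real method would do the database processing.
        return i * 2


worker = Worker()

with futures.ThreadPoolExecutor(max_workers=3) as executor:
    # Reproduces the problem: the bound method supplies self,
    # but nothing supplies i, so the call fails inside the worker thread.
    broken = executor.submit(worker.mt_main)

    # The fix: extra positional arguments to submit() are forwarded
    # to the callable, so j becomes the i parameter of mt_main.
    jobs = {executor.submit(worker.mt_main, j): j for j in range(5)}

    results = {}
    for future in futures.as_completed(jobs):
        j = jobs[future]  # look up which input this future belongs to
        if future.exception() is not None:
            print("job", j, "failed:", future.exception())
        else:
            results[j] = future.result()

# The executor has shut down, so the broken future is finished by now.
broken_error = broken.exception()
print(type(broken_error).__name__)
print(sorted(results.items()))
```

Keeping the future-to-`j` mapping is what lets the `as_completed` loop report which input a failed job belonged to, since futures complete in arbitrary order.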