Tags: python, spawn, eventlet

Eventlet's pool.spawn does not seem to run my function. So strange


Hi all,

I am using eventlet to implement a web crawler. My code looks like this:

import time
import urllib2

import eventlet


urls = [
    "http://deeplearning.stanford.edu/wiki/index.php/UFLDL%E6%95%99%E7%A8%8B",
    "http://www.google.com.hk",
    "http://www.baidu.com",
]


def fetch(url):
    print('entering fetch')
    body = urllib2.urlopen(url).read()
    print('body')


pool = eventlet.GreenPool(100)
for url in urls:
    pool.spawn(fetch, url)
    time.sleep(10)

But it prints nothing, and it seems that fetch does not run at all.

By the way, pool.imap does work.

What is going on?

What I want is to handle URLs that arrive as a stream, i.e. one at a time, like this:

while True:
    url = getOneUrl()       # get the next URL from the stream
    pool.spawn(fetch, url)  # fetch the URL

But that does not work either.

Thanks in advance.


Solution

  • In eventlet's implementation, iterating over the result of pool.imap waits for the green threads to finish, but pool.spawn does not wait: it only schedules the green thread and returns immediately.

    Try adding a wait or a sleep at the end of your script. The spawned green threads will then get a chance to execute your function.

    pool.waitall()
    

    or

    eventlet.sleep(10)
    

    Actually, 'for body in pool.imap(fetch, urls)' calls pool.imap and then iterates over the results. The call to pool.imap itself does not wait; the iteration does.

    Try it without iterating over the result. Without iteration, it returns immediately, just like pool.spawn.

    pool = eventlet.GreenPool(100)
    pool.imap(fetch, urls)
    

    If you want to know more about this, just check the code in greenpool.py.


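    All green threads run in a single OS thread. Print the following from inside each green thread (after importing threading and eventlet's greenthread module): the green thread differs, but the thread is the same every time.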

    print greenthread.getcurrent(), threading.current_thread()
    

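    If you loop without calling eventlet.sleep, that single thread is blocked the whole time and the other green threads have no chance to be scheduled. This is also why time.sleep(10) in your loop does not help: unless the time module has been monkey-patched, it blocks the thread instead of yielding to the other green threads. So one possible solution is to call eventlet.sleep after spawn in your while loop.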