Tags: python, asynchronous, twisted, twisted.web, twisted.internet

Understanding Twisted and async programming. Why does one piece of code work and another not?


Sorry in advance for the wall of text. I am struggling to understand Twisted and async programming in general.

I am using Python 2.7 with Twisted 15.4.0.

I tried this example for downloadPage() and it works perfectly.

I tinkered a bit with it, changing the callbacks from lambdas to proper functions. It worked. I also tried removing the reactor.stop() statement from both the callback and the errback; the only effect is that the script doesn't stop after downloading. This makes sense, since the event loop is still running.

I also tried giving a broken URL.

If I have only one call to downloadPage(), the program blocks. It does not fire the errback.

If I have two calls, one with a broken URL and one with a correct URL, it runs, finishes, and fires the callback (I assume for the correct one) and terminates.

My first question is: Why does this happen? Why isn't the errback fired for a broken URL? Shouldn't a broken URL raise an error?


I have a separate piece of code that looks something like this:

def receive_some_data():
    # Do some non-Twisted stuff - the script runs and passes these lines
    while True:
        try:
            # Do some other non-Twisted stuff
            print "1"
            downloadPage("http://www.google.com", "foo").addCallbacks(
                lambda value: (println('Good'), reactor.stop()),
                lambda error: (println("an error occurred", error), reactor.stop()))
            print "2"
        except Exception as e:
            print str(e)

def main():
    reactor.callWhenRunning(receive_some_data)
    reactor.run()

This code does not work. It prints "1", it prints "2", but neither the callback nor the errback is ever called, and the page is not downloaded to "foo".

My second question is: Why doesn't this code work? Is it because of the while loop? If so, how does the while loop affect the deferred and its callback chain?


Edit 1: I changed the "while True" to a "while condition" that terminates after 3 iterations. Now the files are downloaded and the callbacks are called. Why is the infinite loop interfering with the download?

Also, my "# Do some other non-Twisted stuff" lines inside the while loop perform a read from a pipe. This is where I get my URLs.

What is the best way to read my URLs continuously and schedule a callback for the moment each download is finished?


Edit 2: I changed the code to something like this:

def receive_some_data():
    # Do some non twisted stuff
    if condition:
        # Request more urls
        downloadPage(url,file).addCallbacks(success,fail)
    else:
        # Ask sender not to send urls atm
        pass

def main():
    reactor.callWhenRunning(receive_some_data)
    reactor.run()

I restructured the code like this thinking that callWhenRunning() would keep calling the receive_some_data function (like an infinite loop). It doesn't; the function is called only once.

How can I get the event loop to keep calling this function?


Edit 3: Managed to get this working somewhat. I found out about LoopingCall (twisted.internet.task.LoopingCall). I use it to call my code from Edit 2 every x seconds. It works. Are there any other methods?


Thanks!


Solution

  • Basically, calling downloadPage multiple times starts multiple downloads in parallel. The first one to finish is the first to call its callback or errback, which stops the reactor and, with it, all the other downloads.

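    So when you change the URL of one call to a bad host, that request takes some time to time out. By then, the other (good) download has finished and stopped the reactor. This is why your code only appears to work when a good URL is present: with only the bad URL, nothing stops the reactor until that request has failed, so the program looks blocked in the meantime.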

    The while loop in your other example creates one download after another and never returns from receive_some_data. But the function has to return before the callbacks or errbacks of the downloads can run. Twisted runs only one call at a time, so the reactor cannot process any network events while your function is still looping.
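
Both effects can be reproduced without any network access. What follows is a minimal sketch of a toy single-threaded event loop, not Twisted's actual reactor: ToyReactor, fake_download and report_and_stop are made-up names for illustration, and the download durations (1 and 30 "seconds") are arbitrary assumptions standing in for a fast download and a slow timeout. It only models the two rules described above: scheduled calls run one at a time, and stopping the loop discards whatever is still pending.

```python
import heapq


class ToyReactor(object):
    """Toy single-threaded event loop; a model of the idea, not Twisted's reactor."""

    def __init__(self):
        self._calls = []      # heap of (due_time, sequence, func, args)
        self._seq = 0         # tie-breaker so equal due times run in FIFO order
        self.now = 0          # simulated clock, in made-up seconds
        self.running = False
        self.log = []

    def call_later(self, delay, func, *args):
        # Schedule func; it can only run once run() gets control back.
        heapq.heappush(self._calls, (self.now + delay, self._seq, func, args))
        self._seq += 1

    def stop(self):
        self.running = False

    def run(self):
        # Exactly one scheduled call executes at a time, in due-time order.
        self.running = True
        while self.running and self._calls:
            self.now, _, func, args = heapq.heappop(self._calls)
            func(*args)


def fake_download(reactor, url, seconds, on_done):
    # Simulated download: on_done(url) fires `seconds` later, from the loop.
    reactor.call_later(seconds, on_done, url)


# Case 1: two parallel "downloads"; the first result stops the loop.
first = ToyReactor()

def report_and_stop(url):
    first.log.append("result for " + url)
    first.stop()

fake_download(first, "good-url", 1, report_and_stop)
fake_download(first, "bad-url", 30, report_and_stop)   # slow failure, e.g. a timeout
first.run()
print(first.log)    # the bad-url result is never delivered


# Case 2: the function that starts downloads must return before results arrive.
second = ToyReactor()

def receive_some_data(urls):
    for url in urls:   # bounded loop; `while True` here would never return
        second.log.append("started " + url)
        fake_download(second, url, 1, lambda u: second.log.append("finished " + u))
    second.log.append("receive_some_data returned")

second.call_later(0, receive_some_data, ["a", "b"])
second.run()
print(second.log)
```

In the first case only the good URL's result is logged, because stop() ends the loop before the slow request is due. In the second case every "finished" entry appears after "receive_some_data returned", which is why an infinite loop inside that function would keep any callback from ever firing.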