
How can I get non-blocking socket connect() calls?


I have a fairly simple problem here. I need to communicate with a lot of hosts simultaneously, but I do not really need any synchronization because each request is pretty much self-contained.

Because of that, I chose to work with asynchronous sockets rather than spawning lots of threads. Now I have a small problem:

The async stuff works like a charm, but when I connect to 100 hosts and all 100 time out (timeout = 10 seconds), I wait 1000 seconds just to find out that all my connections failed.

Is there any way to get non-blocking socket connects as well? My socket is already set to non-blocking, but calls to connect() still block.

Reducing the timeout is not an acceptable solution.

I am doing this in Python, but I guess the programming language doesn't really matter in this case.

Do I really need to use threads?


Solution

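  • You need to parallelize the connects as well, because the sockets block when you set a timeout: in Python, settimeout() with a non-zero value puts the socket in timeout mode, so connect() waits until it succeeds or the timeout expires. Alternatively, you could leave the timeout unset, keep the sockets non-blocking, and use the select module to wait on all of them at once.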
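
A minimal sketch of the select approach, assuming IPv4 TCP and hosts given as IP addresses (connect_ex() still resolves hostnames synchronously). The helper name connect_all and its return shape are made up for illustration; they are not part of any library.

```python
import errno
import select
import socket
import time


def connect_all(addresses, timeout=10.0):
    """Start all connects at once; return {address: socket or None}.

    Hypothetical helper. `addresses` is a list of (ip, port) tuples.
    """
    pending = {}   # socket -> address, connect still in progress
    results = {}   # address -> connected socket, or None on failure
    for address in addresses:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setblocking(False)
        err = sock.connect_ex(address)  # returns immediately
        if err in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
            pending[sock] = address
        else:
            sock.close()
            results[address] = None

    # One shared deadline, so 100 dead hosts cost about 10 s, not 1000 s.
    deadline = time.monotonic() + timeout
    while pending:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        socks = list(pending)
        # A socket turns writable once its connect has finished; some
        # platforms report failures in the exceptional set instead.
        _, writable, errored = select.select([], socks, socks, remaining)
        for sock in set(writable) | set(errored):
            address = pending.pop(sock)
            # SO_ERROR is 0 on success, otherwise the errno of the failure.
            if sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
                results[address] = sock
            else:
                sock.close()
                results[address] = None

    # Anything still pending has run into the deadline.
    for sock, address in pending.items():
        sock.close()
        results[address] = None
    return results
```

Note that select() is limited in how many descriptors it can watch (commonly 1024), which is fine for around 100 hosts.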

    You can do this with the dispatcher class in the asyncore module (note that asyncore has been deprecated since Python 3.6 and was removed in Python 3.12). Take a look at the basic HTTP client example in its documentation. Multiple instances of that class won't block each other on connect. You can do this just as easily using threads, and I think that makes tracking socket timeouts easier, but since you're already using asynchronous methods you might as well stay on the same track.

    As an example, the following works on all my Linux systems:

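    # Python 3 syntax; asyncore is only available up to Python 3.11.
    import asyncore
    import socket

    class client(asyncore.dispatcher):
        def __init__(self, host):
            self.host = host
            asyncore.dispatcher.__init__(self)
            self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
            # Non-blocking connect; asyncore.loop() reports the outcome later.
            self.connect((host, 22))

        def handle_connect(self):
            print('Connected to', self.host)

        def handle_close(self):
            self.close()

        def handle_write(self):
            # Nothing to send; we only want to read the SSH banner.
            self.send(b'')

        def handle_read(self):
            print(' ', self.recv(1024).decode(errors='replace'))

    # One dispatcher per host; every connect is started before the loop runs.
    clients = []
    for i in range(50, 100):
        clients.append(client('cluster%d' % i))

    # Runs until every dispatcher has closed its socket.
    asyncore.loop()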
    

    Here, numerous machines in cluster50 through cluster99 are unresponsive or nonexistent. Even so, this immediately starts printing:

    Connected to cluster50
      SSH-2.0-OpenSSH_4.3
    
    Connected to cluster51
      SSH-2.0-OpenSSH_4.3
    
    Connected to cluster52
      SSH-2.0-OpenSSH_4.3
    
    Connected to cluster60
      SSH-2.0-OpenSSH_4.3
    
    Connected to cluster61
      SSH-2.0-OpenSSH_4.3
    
    ...
    

    This, however, does not take getaddrinfo into account, which has to block. If DNS lookups are slow or failing, everything has to wait. You probably need to resolve the hostnames separately on your own and use the IP addresses in your async loop.
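
One way to do the lookups up front, sketched with a thread pool so that a slow resolver only stalls its own worker. The names resolve and resolve_all are hypothetical, and the sketch assumes IPv4 and simply takes the first address returned.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def resolve(host, port):
    """Return the first IPv4 address for host, or None if the lookup fails."""
    try:
        infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return None
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr is (ip, port) for AF_INET.
    return infos[0][4][0]


def resolve_all(hosts, port, workers=20):
    """Resolve all hosts concurrently; return {host: ip or None}."""
    hosts = list(hosts)  # we iterate twice, so materialize generators
    with ThreadPoolExecutor(max_workers=workers) as pool:
        ips = pool.map(lambda h: resolve(h, port), hosts)
        return dict(zip(hosts, ips))
```

The addresses that come back (skipping the None entries) can then be passed to the async loop in place of the hostnames, so nothing in the loop has to call getaddrinfo.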

    If you want a bigger toolkit than asyncore, take a look at Twisted. It's a bit heavy to get into, but in my opinion it is the best network programming toolkit you can get for Python.
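
If you are on a Python version without asyncore and don't want a third-party dependency, the standard library's asyncio can express the same pattern. This is a rough sketch, not a drop-in port of the code above: probe and probe_all are names invented here, and the 10 second default just mirrors the timeout from the question.

```python
import asyncio


async def probe(host, port, timeout=10.0):
    """Return (host, banner) on success, or (host, None) if connecting fails."""
    try:
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout)
    except (OSError, asyncio.TimeoutError):
        return host, None
    try:
        # Read whatever the server sends first (e.g. an SSH banner).
        banner = await asyncio.wait_for(reader.read(1024), timeout)
    except (OSError, asyncio.TimeoutError):
        banner = b''
    finally:
        writer.close()
    return host, banner


async def probe_all(hosts, port, timeout=10.0):
    """Probe every host concurrently; results keep the order of `hosts`."""
    # gather() runs all probes at once, so the timeouts overlap
    # instead of adding up.
    return await asyncio.gather(*(probe(h, port, timeout) for h in hosts))
```

You would run it with something like asyncio.run(probe_all(hosts, 22)). The event loop does its name lookups in a worker thread, so a slow DNS query should not stall the other connections.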