Tags: linux, python-2.7, sockets, python-multithreading, low-level

Python, Multithreading, sockets sometimes fail to create


I recently observed some rather odd behaviour that only happens on Linux but not on FreeBSD, and was wondering whether anyone has an explanation, or at least a guess, as to what might really be going on.

The problem:

The socket creation call, socket.socket(), sometimes fails. This only happens when multiple threads are creating sockets; single-threaded code works just fine.

As for how socket.socket() fails: most of the time I get "error 13: Permission denied", but I have also seen "error 93: Protocol not supported".

Notes:

  1. I have tried this on Ubuntu 18.04 (the bug is there) and FreeBSD 12.0 (the bug is not there).
  2. It only happens when multiple threads are creating sockets.
  3. I've used UDP as the protocol for the sockets, and that seems to be the more fault-tolerant case. I have tried TCP as well, and it goes haywire even faster, with similar errors.
  4. It only happens sometimes, so multiple runs might be required. Alternatively, as in the code below, an inflated number of threads should also do the trick.

Code:

Here's some minimal code that you can use to reproduce it:


from threading import Thread
import socket

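def foo():
    # Look up the protocol number for UDP (done in every thread)
    udp = socket.getprotobyname('udp')

    try:
        send_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, udp)
    except Exception as e:
        # Report the exception type and details if socket creation fails
        print(type(e))
        print(repr(e))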
    

def main():
    for _ in range(6000):
        t = Thread(target=foo)
        t.start()

main()

Note:

  1. I have used an artificially large number of threads just to maximize the probability of hitting the error at least once within a run with UDP. As I said earlier, if you try TCP you'll see A LOT of errors with that number of threads. In reality, even a more realistic number of threads, like 20 or even 10, triggers the error; you would just likely need multiple runs to observe it.
  2. Wrapping the socket creation in a while loop with try/except does not help: once a call fails, all subsequent calls in that loop fail as well.
  3. Wrapping the socket creation in try/except and restarting the function from the exception handler, i.e. calling it again, does work: the repeated call does not fail.

Any ideas, suggestions or explanations are welcome!!!

P.S.

Technically I know I can get around my problem by having a single thread create as many sockets as I need and pass them as arguments to my other threads, but that is not the point really. I am more interested in why this is happening and how to solve it, rather than what workarounds there might be, even though these are also welcome. :)
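For reference, a minimal sketch of that workaround, assuming UDP sockets and a small thread count. The names worker and run are made up for illustration, and the worker only records the file descriptor where real code would send or receive:

```python
import socket
from threading import Thread

def worker(sock, results, index):
    # The worker only uses the socket it was handed; it never creates one.
    try:
        results[index] = sock.fileno()
    finally:
        sock.close()

def run(num_threads=8):
    # All sockets are created up front, in a single thread.
    socks = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
             for _ in range(num_threads)]
    results = [None] * num_threads
    threads = [Thread(target=worker, args=(s, results, i))
               for i, s in enumerate(socks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling run() returns one file descriptor per thread, which shows that every worker received a usable socket.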


Solution

  • I managed to solve it. The problem comes from getprotobyname() not being thread safe!

    See the Linux man page for getprotobyname(3).

    On another note, the FreeBSD man page also hints that this might cause problems with concurrency; however, I did not observe any in my experiments. Maybe someone can follow up?

    Anyway, for anyone interested, a fixed version of the code gets the protocol number once in the main thread (which seems sensible, and is what I should have done in the first place) and then passes it to each thread as an argument. This both reduces the number of lookups (and the system calls behind them) and removes the concurrency problem from the program. The code would look as follows:

    from threading import Thread
    import socket
    
    def foo(proto_num):
        # proto_num was resolved once in the main thread; no lookup happens here
        try:
            send_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, proto_num)
        except Exception as e:
            print(type(e))
            print(repr(e))
    
    
    def main():
        proto_num = socket.getprotobyname('udp')
        for _ in range(6000):
            t = Thread(target=foo, args=(proto_num,))
            t.start()
    
    main()
    
    

    With this change, the "Permission denied" and "Protocol not supported" exceptions on socket creation no longer occur. Also note that with SOCK_DGRAM the proto_num argument is redundant and can be omitted altogether; passing it explicitly is more relevant if you want to create a SOCK_RAW socket.