Tags: python, macos, network-programming, multiprocessing, udp

Multiprocess UDP server program in Python behaves differently on Linux and MacOS


I have a multiprocess UDP server in Python (see the code below). It correctly spawns the required number of processes. On Linux, it load-balances the UDP data between the processes properly, but on MacOS only one of the processes seems to receive the data. Am I missing some parameters on MacOS here?

from multiprocessing import Pool
import socket
import os

def s(PORT):
    UDP_IP = ""
    UDP_PORT = PORT

    sock = socket.socket(socket.AF_INET, # Internet
                        socket.SOCK_DGRAM) # UDP
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((UDP_IP, UDP_PORT))
    print(f"Started server {os.getpid()}")

    while True:
        data, addr = sock.recvfrom(1024)
        print(f"Received message on {os.getpid()}: {data}")

if __name__ == '__main__':
    print(f"Program started {os.getpid()}")
    with Pool(5) as p:
        p.map(s, [8000] * 5)
(Output MacOS)

Program started 32205
Started server 32233
Started server 32235
Started server 32234
Started server 32232
Started server 32236
Received message on 32233: b'hello 1'
Received message on 32233: b'hello 2'
Received message on 32233: b'hello 3'
Received message on 32233: b'hello 4'
Received message on 32233: b'hello 5'
Received message on 32233: b'hello 6'
Received message on 32233: b'hello 7'
Received message on 32233: b'hello 8'
Received message on 32233: b'hello 9'
Received message on 32233: b'hello 10'
(Output Linux)

Program started 1860163
Started server 1860165
Started server 1860164
Started server 1860166
Started server 1860167
Started server 1860168
Received message on 1860167: b'hello 1'
Received message on 1860165: b'hello 2'
Received message on 1860167: b'hello 3'
Received message on 1860167: b'hello 4'
Received message on 1860164: b'hello 5'
Received message on 1860168: b'hello 6'
Received message on 1860164: b'hello 7'
Received message on 1860167: b'hello 8'
Received message on 1860167: b'hello 9'
Received message on 1860164: b'hello 10'

I have tried SO_REUSEADDR as well, but then only one process starts. I guess the other processes are waiting for bind to complete, but the first process has already taken hold of the port.

I also tried daemonizing with gunicorn, but the same problem occurs there: only one worker receives data.

Here is a simpler version of the program with gunicorn.

# run with 'gunicorn udp_server --reuse-port -w 5'
import socket
import os

UDP_IP = ""
UDP_PORT = 8001

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
sock.bind((UDP_IP, UDP_PORT))
print(f"Process started {os.getpid()}")

while True:
    data, addr = sock.recvfrom(1024)
    print(f"Received message ({os.getpid()}): {data}")

Extra: to send UDP data

for i in `seq 100`; do echo -n "hello $i" | nc -u -w 1 127.0.0.1 8000; done
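
For anyone without nc at hand, here is a rough Python equivalent of the loop above. It is only a sketch: the function name send_messages is made up for illustration, and it opens a fresh socket for every datagram on the assumption that this mirrors the nc loop, where each invocation is a separate process with its own socket and therefore its own source port.

```python
import socket


def send_messages(host, port, count):
    """Send 'hello 1' .. 'hello <count>' as separate UDP datagrams.

    Hypothetical helper, not part of the original question.
    """
    sent = []
    for i in range(1, count + 1):
        payload = f"hello {i}".encode()
        # New socket per message, like one nc process per message.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(payload, (host, port))
        sent.append(payload)
    return sent


if __name__ == "__main__":
    # Same target as the nc loop: 100 messages to 127.0.0.1:8000.
    send_messages("127.0.0.1", 8000, 100)
```

Unlike the nc loop, this does not wait one second between messages, so the datagrams arrive in a quick burst.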

Solution

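  • Don't create 5 different sockets. Create one socket in the parent process; it will be inherited by the children. All workers then call recvfrom on that same socket, and each datagram goes to whichever worker picks it up.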

    from multiprocessing import Pool
    import socket
    import os
    
    def s(sock):
        while True:
            data, addr = sock.recvfrom(1024)
            print(f"Received message on {os.getpid()}: {data}")
    
    if __name__ == '__main__':
        print(f"Program started {os.getpid()}")
        UDP_IP = ""
        UDP_PORT = 8000
    
        sock = socket.socket(socket.AF_INET, # Internet
                            socket.SOCK_DGRAM) # UDP
        sock.bind((UDP_IP, UDP_PORT))
        print(f"Started server {os.getpid()}")
        with Pool(5) as p:
            p.map(s, [sock] * 5)
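
To make the inheritance explicit, here is a self-checking variant of the same idea. It is a sketch under several assumptions that are not in the original answer: it uses multiprocessing.Process with the "fork" start method (Unix only) instead of Pool, binds to port 0 so the OS picks a free port, and gives the workers a 2 second receive timeout so the demo terminates. The names worker and run_demo are made up for illustration.

```python
import multiprocessing as mp
import os
import socket


def worker(sock, results):
    """Receive on the inherited socket and report (pid, data) for each datagram."""
    sock.settimeout(2.0)  # assumption: give up after 2 s of silence so the demo ends
    while True:
        try:
            data, _addr = sock.recvfrom(1024)
        except socket.timeout:
            break
        results.put((os.getpid(), data))


def run_demo(num_workers=3, num_messages=10):
    # One socket, bound once in the parent before any worker is started.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    port = sock.getsockname()[1]

    # Assumption: "fork" start method, so children inherit the open descriptor.
    ctx = mp.get_context("fork")
    results = ctx.Queue()
    workers = [ctx.Process(target=worker, args=(sock, results))
               for _ in range(num_workers)]
    for w in workers:
        w.start()

    # Send the test datagrams from the parent.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        for i in range(1, num_messages + 1):
            sender.sendto(f"hello {i}".encode(), ("127.0.0.1", port))

    # Drain the queue before joining so no worker blocks on a full pipe.
    received = [results.get(timeout=5) for _ in range(num_messages)]
    for w in workers:
        w.join()
    sock.close()
    return received


if __name__ == "__main__":
    for pid, data in run_demo():
        print(f"Received message on {pid}: {data}")
```

Which worker receives a given datagram is up to the kernel, so the split between PIDs will vary from run to run and between operating systems. The sketch only shows that every datagram is delivered to one of the children sharing the socket.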