Tags: python, linux, sockets, networking, throughput

Throughput Measurements of a Custom Network Device Using Linux and Python 3


I have a custom network device with 2 Ethernet interfaces. It can receive UDP messages on either interface, and also a custom experimental protocol on both. Once it receives a message, it processes it and forwards the cmd/results to the other interface (in my case it acts like a firewall, but I want to keep the discussion about general networked devices).

I wanted to measure the throughput and latency of the device, so I connected two Ethernet cables from the device to a RHEL 8 Linux system. I created a Python 3 application running on the Linux system that has 3 processes: 1) one to send data on one interface, 2) one to receive the processed data on the other interface, and 3) one to send dummy data in the opposite direction, so that the measurements are taken while the device is loaded with traffic in both directions.

I am using the Python multiprocessing library and Linux UDP sockets from the socket library, as well as raw sockets for the custom protocol (iperf3 does not work in my case because of the custom protocol). To measure throughput and latency, I create a batch of test packets up front and timestamp each one right after it is sent out by process 1. While process 1 is sending the packets, process 2 receives and timestamps them on the other side. After all of the bulk data has been received, process 2 computes the difference between adjacent receive timestamps to get a throughput measurement, and the difference between send and receive timestamps to get latency measurements.

The problem is related to the multiprocessing/socket interplay. On the machine, when I call sock.sendto(data, (dest, port)) from process 1 and sock2.recvfrom(bufsize) from process 2, they do not seem to run concurrently. It looks like process 1 sends all of its packets (I am testing with 300 packets of 1000 bytes each) and only then does process 2 begin receiving. I know from Wireshark traces that the packets actually reach the Linux system well before process 2 starts processing them. Even stranger, process 3 finishes sending all of its packets before process 2 receives a single one. To make matters worse, the receiver (process 2) sometimes misses packets when the total amount of packet bytes exceeds 200000.

Below is some pseudocode of what I am doing:

import multiprocessing
import time
import socket

def sender(sock, queue, data_list, dest):
    # Timestamp each packet right after it is handed to the kernel.
    timestamps = []
    for data in data_list:
        sock.sendto(data, dest)
        timestamps.append(time.perf_counter())
    queue.put(timestamps)

def sender_dummy(sock, data_list, dest):
    # Reverse-direction load only; no timestamps needed.
    for data in data_list:
        sock.sendto(data, dest)

def main():
    ## Set up the sockets with the appropriate timeouts, interfaces, and protocols
    ...
    ## Set up the test data
    ...
    ## Begin multi-processing
    q = multiprocessing.Queue()
    task_main = multiprocessing.Process(target=sender, args=(sender_sock, q, test_data, (dest_addr, dest_port)))
    task_dummy = multiprocessing.Process(target=sender_dummy, args=(sender_dummy_sock, dummy_data, (dest_addr, dest_port)))


    timestamps_recv = []
    task_dummy.start()
    task_main.start()

    # Receive in the parent process, timestamping each packet on arrival.
    for _ in range(len(test_data)):
        data, addr = recv_sock.recvfrom(TEST_DATA_SIZE)
        timestamps_recv.append(time.perf_counter())

    task_dummy.join()
    task_main.join()
    timestamps_send = q.get()

    # Calculate latency and throughput
    ...
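
For completeness, the elided calculation step looks roughly like this (assuming timestamps_send and timestamps_recv line up one-to-one and TEST_DATA_SIZE is the payload size in bytes):

# Latency: element-wise difference between send and receive timestamps.
latencies = [r - s for s, r in zip(timestamps_send, timestamps_recv)]

# Throughput: payload size divided by the inter-arrival gap of adjacent packets.
gaps = [b - a for a, b in zip(timestamps_recv, timestamps_recv[1:])]
throughputs_bps = [8 * TEST_DATA_SIZE / g for g in gaps if g > 0]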

Any help resolving the packet drops and achieving true concurrency between the senders and the receiver is appreciated.


Solution

  • Here are a few things you can try:

    1. Ensure sockets are not shared across processes. A socket object that crosses a fork can behave unpredictably. Create each socket inside its own process, after the fork (see the first sketch after this list).

    2. The processes might be blocking each other due to synchronization issues. Use a separate queue for each sender process, and make sure the receiver is actively listening before the senders start, rather than being started after them (the adjusted receiver at the end of this answer does exactly that).

    3. Try enlarging the socket buffer sizes with socket.setsockopt() (see the second sketch after this list).

    4. Cross-check the timestamps against an external reference (for example, the Wireshark capture timestamps) to ensure they accurately represent the network events.

    5. The GIL is not the culprit across processes: each process has its own interpreter and its own GIL. The cost of multiprocessing here is process start-up and inter-process communication overhead, not GIL contention.
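
    For point 1, here is a minimal sketch of creating the sender socket inside the child process; the interface pinning and the b"eth0" name are assumptions (SO_BINDTODEVICE is Linux-only and requires a Python build that exposes socket.SO_BINDTODEVICE):

    def sender(queue, data_list, dest, iface):
        # Created after the fork, so no socket object crosses a process boundary.
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # Hypothetical: pin the socket to one physical interface (Linux-only).
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, iface)
        timestamps = []
        for data in data_list:
            sock.sendto(data, dest)
            timestamps.append(time.perf_counter())
        sock.close()
        queue.put(timestamps)

    task_main = multiprocessing.Process(target=sender, args=(q, test_data, (dest_addr, dest_port), b"eth0"))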
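
    For point 3, the ~200000-byte threshold at which your drops appear is suspiciously close to the default Linux receive buffer (net.core.rmem_default is 212992 bytes on many systems). A sketch with a hypothetical 4 MiB buffer:

    RCVBUF_BYTES = 4 * 1024 * 1024  # hypothetical size; tune to your burst size
    recv_sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, RCVBUF_BYTES)
    # The kernel silently caps the request at net.core.rmem_max, so that sysctl
    # may need raising first, e.g.: sysctl -w net.core.rmem_max=4194304
    # Linux reports back double the requested value for bookkeeping overhead:
    print(recv_sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))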

    I would consider the following general changes:

    Use threading for the I/O-bound tasks: threads share the same memory space and are lighter weight than processes, so the receiver can append timestamps to a plain list (see the sketch below).
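
    A minimal threaded receiver, assuming recv_sock, test_data, and TEST_DATA_SIZE from your question:

    import threading

    timestamps_recv = []

    def recv_worker(count):
        # Threads share memory, so appending to the list above just works.
        for _ in range(count):
            recv_sock.recvfrom(TEST_DATA_SIZE)
            timestamps_recv.append(time.perf_counter())

    t_recv = threading.Thread(target=recv_worker, args=(len(test_data),))
    t_recv.start()  # the receiver is listening before any sender runs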

    Look into the asyncio library, which is excellent at multiplexing many I/O operations in a single thread and can manage the concurrent send/receive more efficiently (see the sketch below).
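
    A sketch of an asyncio receiver that reuses the already-configured recv_sock from your question; the fixed 5-second wait is a placeholder for whatever end-of-test condition you use:

    import asyncio
    import time

    class TimestampProtocol(asyncio.DatagramProtocol):
        def __init__(self):
            self.timestamps = []

        def datagram_received(self, data, addr):
            # Called by the event loop for every datagram that arrives.
            self.timestamps.append(time.perf_counter())

    async def run_receiver():
        loop = asyncio.get_running_loop()
        transport, protocol = await loop.create_datagram_endpoint(
            TimestampProtocol, sock=recv_sock)
        await asyncio.sleep(5)  # placeholder: wait for the test to finish
        transport.close()
        return protocol.timestamps

    timestamps_recv = asyncio.run(run_receiver())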

    Here's an adjustment that starts the receiver in its own process, so that it is already listening before the senders run:

    def receiver(recv_sock, queue):
        # A child process has its own memory, so the timestamps must travel
        # back through a queue rather than a shared list.
        timestamps_recv = []
        while True:
            data, addr = recv_sock.recvfrom(TEST_DATA_SIZE)
            if not data:  # a zero-length datagram acts as an end-of-test sentinel
                break
            timestamps_recv.append(time.perf_counter())
        queue.put(timestamps_recv)

    recv_q = multiprocessing.Queue()
    task_recv = multiprocessing.Process(target=receiver, args=(recv_sock, recv_q))
    task_recv.start()
    

    Hope this helps.