The goal is to determine the performance metrics of the UDP protocol, specifically:
This could and should be done without taking into account any software-caused slowdowns (like 99% CPU usage by a side process, or an inefficiently written test program) or hardware issues (like a busy channel, an extremely long line, and so on).
How should I go about estimating these best-possible parameters on a "real system"?
P.S. I will offer a prototype of what I call "a real system".
Consider 2 PCs, PC1 and PC2. They both are equipped with:
You run a server test program on PC1 and a client on PC2. After the program runs, a USB stick is mounted, the results are dumped to a file, and the system then powers down. So I've described an ideal situation; I can't imagine more "sterile" conditions for such an experiment.
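To make it concrete, here is a minimal sketch of the kind of client/server pair I have in mind (Python is only an assumption for illustration; the port number and probe count are arbitrary placeholders, and this is not a tuned benchmark):

```python
# Minimal sketch of the client/server probe described above (assumptions:
# Python, placeholder port 9000, 1000 probe packets).
import socket, sys, time

PORT = 9000      # arbitrary placeholder port
COUNT = 1000     # number of probe packets

def server():
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("0.0.0.0", PORT))
    while True:                          # echo every datagram back to the sender
        data, addr = s.recvfrom(2048)
        s.sendto(data, addr)

def client(server_ip):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(1.0)
    rtts = []
    for i in range(COUNT):
        t0 = time.perf_counter()
        s.sendto(i.to_bytes(4, "big"), (server_ip, PORT))
        try:
            s.recvfrom(2048)
            rtts.append(time.perf_counter() - t0)
        except socket.timeout:           # count the loss, keep going
            pass
    best = min(rtts) * 1000 if rtts else float("nan")
    print("received %d/%d, best RTT %.3f ms" % (len(rtts), COUNT, best))

if __name__ == "__main__":
    server() if sys.argv[1] == "server" else client(sys.argv[2])
```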
For the PPS calculation, divide the throughput of the medium by the total size of the frame on the wire.
For IPv4:
Ethernet preamble, start-of-frame delimiter, and interframe gap: 7 + 1 + 12 = 20 bytes (not counted in the 64-byte minimum frame size).
Ethernet II header and FCS (CRC): 14 + 4 = 18 bytes. IP header: 20 bytes. UDP header: 8 bytes.
Total overhead: 46 bytes (the frame is padded to the 64-byte minimum if the payload is less than 18 bytes) + 20 bytes more "on the wire".
Payload (data):
1-byte payload: padded to 18 bytes so the frame meets the 64-byte minimum; with the 20 bytes of wire overhead, that totals 84 bytes on the wire.
64-byte payload: 46 + 64 = 110 bytes, plus 20 bytes of wire overhead = 130 bytes.
If the throughput of the medium is 125,000,000 bytes per second (1 Gb/s):
1-18 bytes of payload = 1.25e8 / 84 = max theoretical 1,488,095 PPS.
64 bytes of payload = 1.25e8 / 130 = max theoretical 961,538 PPS.
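The same arithmetic as a small sketch (Python is assumed here), so other payload sizes can be plugged in:

```python
# Worked version of the PPS arithmetic above (Python assumed).
LINE_RATE = 125_000_000          # bytes per second on 1 Gb/s Ethernet
WIRE_OVERHEAD = 7 + 1 + 12       # preamble + SFD + interframe gap = 20 bytes
L2_L4_OVERHEAD = 14 + 4 + 20 + 8 # Eth II hdr + FCS + IP hdr + UDP hdr = 46 bytes
MIN_FRAME = 64                   # minimum Ethernet frame (without preamble/IFG)

def bytes_on_wire(payload):
    frame = max(L2_L4_OVERHEAD + payload, MIN_FRAME)  # short payloads are padded
    return frame + WIRE_OVERHEAD

def max_pps(payload):
    return LINE_RATE / bytes_on_wire(payload)

for p in (1, 18, 64, 1472):
    print(f"{p:5d} byte payload -> {bytes_on_wire(p):4d} B on wire, "
          f"{max_pps(p):,.0f} PPS")
```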
These calculations assume a constant stream: the network send buffers are filled constantly. This is not an issue given your modern hardware description. If this were 40/100 Gigabit Ethernet, then CPU, bus speeds, and memory would all be factors.
Ping RTT time:
To calculate the time it takes to transfer data through a medium, divide the amount of data transferred by the speed of the medium.
This is harder, since the ping payload could be any size from the 64-byte minimum frame up to the MTU (~1500 bytes). Ping typically uses the minimum frame size: (64 bytes of frame + 20 bytes of wire overhead) * 2 = 168 bytes, giving a network time of 0.001344 ms. Add the combined process response and reply time, estimated at between 0.35 and 0.9 ms. That value depends on too many internal CPU and OS factors: L1-L3 caching, branch prediction, the ring transitions required (0 to 3 and 3 to 0), the TCP/IP stack implementation, CRC calculations, interrupt processing, network card drivers, DMA, validation of data (skipped by most implementations)...
Max time should be < 1.25 ms, based on anecdotal evidence. My best evaluation was 0.6 ms on older hardware, and I would expect a consistent average of 0.7 ms or less on the hardware as described.
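As a sketch of how the wire portion and the (estimated) processing portion combine, using the same figures (Python assumed; the 0.35-0.9 ms processing values are the estimates above, not derived numbers):

```python
# Network serialization time for a minimum-size ping, using the figures above.
LINE_RATE = 125_000_000        # bytes/s on 1 Gb/s Ethernet
BYTES_ON_WIRE_MIN = 64 + 20    # minimum frame + preamble/SFD/IFG

rtt_wire_s = 2 * BYTES_ON_WIRE_MIN / LINE_RATE        # request + reply
print(f"wire time: {rtt_wire_s * 1e3:.6f} ms")        # ~0.001344 ms

# Host processing (stack, interrupts, scheduling) dominates; the 0.35-0.9 ms
# range is the rough estimate from the text, not something you can derive.
for proc_ms in (0.35, 0.9):
    print(f"estimated RTT with {proc_ms} ms processing: "
          f"{rtt_wire_s * 1e3 + proc_ms:.3f} ms")
```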
Jitter: The only inherent theoretical source of network jitter is the asynchronous nature of the transport, which is resolved by the preamble: max < 0.000064 ms (the 8-byte preamble at 125,000,000 bytes/s). If sync is not established in this time, the entire frame is lost. That is a possibility that needs to be taken into account, since UDP is best-effort delivery.
As evidenced by the description of the RTT, the possible variance in CPU time when executing identical code, along with OS scheduling and drivers, makes this impossible to evaluate effectively.
If I had to estimate, I would design for a maximum of 1 ms of jitter, with provisions for lost packets. It would be unwise to design a system intolerant of faults; even in a "perfect scenario" as described, faults will occur (a nearby lightning strike induces spurious voltages on the wire). UDP has no inherent method for tolerating lost packets.
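Since UDP itself provides nothing here, loss tolerance has to be added at the application layer. A minimal sketch of loss detection, assuming a hypothetical 4-byte sequence number is prepended to every datagram:

```python
# Sketch of application-level loss detection for UDP (assumption: the sender
# prepends a 4-byte big-endian sequence number to every datagram).
def track_loss(datagrams):
    """Yield (lost_count, payload) for a stream of received datagrams."""
    expected = 0
    for dgram in datagrams:
        seq = int.from_bytes(dgram[:4], "big")
        lost = max(seq - expected, 0)   # gap in sequence numbers = lost packets
        expected = seq + 1
        yield lost, dgram[4:]

# Example: sequences 0, 1, 3 received -> packet 2 was lost.
sample = [(0).to_bytes(4, "big") + b"a",
          (1).to_bytes(4, "big") + b"b",
          (3).to_bytes(4, "big") + b"d"]
print([lost for lost, _ in track_loss(sample)])   # [0, 0, 1]
```

This deliberately ignores reordering and sequence-number wrap-around, which a real design would also have to handle.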