c++ network-programming dpdk pcapplusplus

Load balancing in PCPP and DPDK


I’ve been trying to load balance the received packets across multiple physical rx queues (4 queues). I assigned a core to each queue in order to receive packets from it. After using DPDK’s implementation of getting statistics to check which queues got the packets, I see that only one queue is receiving all packets. I tried using RSS for load balancing, but in my case almost every header in each packet I receive is the same - the only difference is the payload. My questions are:

  1. Is there a way to use RSS and hash according to the payload’s data?
  2. Is there any way to do a simple round-robin packet distribution across the rx queues? For example, if I get 16 packets, they would be distributed evenly across all 4 queues.

(Edit) Since I didn’t provide enough information about my use case (my fault), here is some additional context:

  • NIC Details: I am using an LX2160A board with the DPAA2 architecture. The maximum supported hash function representation in my case is 0xFFFF (according to DPDK’s API), indicating that I cannot use RSS_PORT (0x10000 representation) as a hash function for load balancing.
  • Packet Structure: The packets I am working with have mostly identical headers; the differences lie in the payload. This makes traditional RSS hash configurations (like L3/L4 hashing) ineffective for my use case. The only header field that sometimes differs is the destination port.
  • RSS Inputs: I’ve looked into the use of RTE_ETH_RSS_L2_PAYLOAD, but it’s unclear whether the DPAA2 architecture supports using payload data as RSS input. I could not find specific documentation about extending RSS with payload hashing for this NIC.

Given these constraints:

  • Dynamic Firmware Profiles: Are there vendor-provided dynamic firmware profiles for the DPAA2 that might enable payload-based RSS or additional RSS configuration options? If so, where could I find the relevant documentation or bindings in DPDK?
  • IPv4 Checksum Hashing: My NIC reports a supported hash-function mask of 0xFFFF, but I am not sure whether RTE_ETH_RSS_IPV4_CHKSUM is usable in this setup. Has anyone had success with this on DPAA2 NICs?
  • Round-Robin Distribution: Since RSS alone doesn’t seem sufficient, does DPAA2 provide a hardware-based mechanism for round-robin packet distribution? If not, is there an efficient software-based way to achieve this within DPDK while maintaining low latency?

Finally, if tweaking the packet generator to randomize L4 source ports is a viable option, could you recommend tools or methods for achieving this effectively?

I’ll add that in my case I don’t care what destination IP each packet holds. Could that field be a viable option to use for hashing?


Solution

  • For the NIC/platform in question, the list of supported RSS hash functions can be found in the driver code; RTE_ETH_RSS_IPV4_CHKSUM is not among them. As for dynamic firmware profiles, the example in the original answer referred to a very specific NIC vendor, which is different from the one named in the edit. Whether the latter can support similar packet parsing extensions is a question that belongs in a customer care ticket, and so is the question about support for round-robin distribution. That said, it is a common observation that vendors deem round-robin impractical to support: round-robin distribution of connection-based streams leads to packets of the same stream landing in different queues, which is bad for performance, because in a typical use case same-connection packets are meant to be processed by the same queue/core/worker.

    Other than that, the edited question remains rather vague on the packet structure: which precise headers are present, which fields are constant, which can change, and so on. It is also still unclear who generates the traffic in the first place and whether the OP controls how the packets are constructed. If they do, then it is up to them (and not to some third-party tool) to directly alter the way packets are constructed so that, for example, the L4 source port numbers are randomised.
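
If the OP does control packet construction, the source port randomisation mentioned above could look roughly like the sketch below. Everything here is an assumption made for illustration: `UdpHeader` is a simplified host-byte-order stand-in for whatever the generator really builds, `SourcePortRandomizer` and `simulate` are hypothetical helpers, and `pick_queue` is a toy hash, not the algorithm DPAA2 uses. It only shows that once one hashed field varies, otherwise identical headers stop mapping to a single queue.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical, simplified UDP header kept in host byte order; a real
// generator would write these fields in network byte order.
struct UdpHeader {
    std::uint16_t src_port;
    std::uint16_t dst_port;
    std::uint16_t length;
    std::uint16_t checksum;
};

// Hypothetical helper: draws L4 source ports from the dynamic range
// 49152-65535 so that otherwise identical headers differ in one hashed field.
class SourcePortRandomizer {
public:
    explicit SourcePortRandomizer(std::uint32_t seed)
        : rng_(seed), dist_(49152, 65535) {}

    void apply(UdpHeader& hdr) {
        hdr.src_port = static_cast<std::uint16_t>(dist_(rng_));
        // 0 means "no checksum" for UDP over IPv4; recompute it instead
        // if the receiver validates checksums.
        hdr.checksum = 0;
    }

private:
    std::mt19937 rng_;
    std::uniform_int_distribution<std::uint32_t> dist_;
};

// Toy stand-in for the NIC's hash (NOT the DPAA2 algorithm): mixes the two
// ports and reduces the result to a queue index.
std::size_t pick_queue(const UdpHeader& hdr, std::size_t num_queues) {
    std::uint32_t h =
        (static_cast<std::uint32_t>(hdr.src_port) << 16) | hdr.dst_port;
    h ^= h >> 16;
    h *= 0x45d9f3bU;
    h ^= h >> 16;
    return h % num_queues;
}

// Builds `count` packets that differ only in the randomised source port and
// counts how many the toy hash assigns to each of 4 queues.
std::array<std::size_t, 4> simulate(std::size_t count, std::uint32_t seed) {
    SourcePortRandomizer randomizer(seed);
    std::array<std::size_t, 4> hits{};
    for (std::size_t i = 0; i < count; ++i) {
        UdpHeader hdr{0, 5000, 64, 0};
        randomizer.apply(hdr);
        ++hits[pick_queue(hdr, hits.size())];
    }
    return hits;
}
```

Calling `simulate(4000, 42)` returns the per-queue counts for 4000 such packets. Whether the real hardware spreads them comparably depends on the port RSS configuration actually including the L4 source port, which has to be checked against the driver.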