Tags: networking, dpdk, mellanox

DPDK MLX5 driver - QP creation failure


I am developing a DPDK program using a Mellanox ConnectX-5 100G.

My program starts N workers (one per core), and each worker handles its own dedicated TX and RX queue, so I need to set up N TX and N RX queues.

I am using the flow director / rte_flow APIs to steer ingress traffic to the different queues.
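
Roughly, each rule looks like this (a simplified sketch, not my exact rules; the IPv4 destination match and the helper name are just placeholders):

    #include <stdint.h>
    #include <rte_byteorder.h>
    #include <rte_flow.h>

    /* Steer ingress IPv4 traffic with a given destination address to a
     * specific RX queue. The match criterion is only an example. */
    static struct rte_flow *
    steer_to_queue(uint16_t port_id, uint16_t queue_id, rte_be32_t dst_ip,
                   struct rte_flow_error *error)
    {
        struct rte_flow_attr attr = { .ingress = 1 };

        struct rte_flow_item_ipv4 ip_spec = { .hdr.dst_addr = dst_ip };
        struct rte_flow_item_ipv4 ip_mask = { .hdr.dst_addr = RTE_BE32(0xffffffff) };

        struct rte_flow_item pattern[] = {
            { .type = RTE_FLOW_ITEM_TYPE_ETH },
            { .type = RTE_FLOW_ITEM_TYPE_IPV4, .spec = &ip_spec, .mask = &ip_mask },
            { .type = RTE_FLOW_ITEM_TYPE_END },
        };

        struct rte_flow_action_queue queue = { .index = queue_id };
        struct rte_flow_action actions[] = {
            { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
            { .type = RTE_FLOW_ACTION_TYPE_END },
        };

        return rte_flow_create(port_id, &attr, pattern, actions, error);
    }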

For each RX queue I create an mbuf pool with:

n = 262144
cache size = 512
priv_size = 0
data_room_size = RTE_MBUF_DEFAULT_BUF_SIZE
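
Roughly like this (a simplified sketch, not my exact code; the pool name and socket choice are placeholders):

    #include <stdio.h>
    #include <stdint.h>
    #include <rte_lcore.h>
    #include <rte_mbuf.h>
    #include <rte_mempool.h>

    /* One mbuf pool per RX queue, using the parameters listed above. */
    static struct rte_mempool *
    create_rx_pool(uint16_t queue_id)
    {
        char name[RTE_MEMPOOL_NAMESIZE];

        snprintf(name, sizeof(name), "rx_pool_%u", queue_id);
        return rte_pktmbuf_pool_create(name,
                                       262144,                    /* n */
                                       512,                       /* cache size */
                                       0,                         /* priv_size */
                                       RTE_MBUF_DEFAULT_BUF_SIZE, /* data room */
                                       rte_socket_id());
    }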

For N <= 4 everything works fine, but with N = 8, rte_eth_dev_start() fails with "Unknown error -12" (-12 is -ENOMEM)

and the following log messages:

net_mlx5: port 0 Tx queue 0 QP creation failure
net_mlx5: port 0 Tx queue allocation failed: Cannot allocate memory

I tried:

  • increasing the number of hugepages (up to 64 x 1 GB)
  • changing the pool size in different ways
  • both DPDK 18.05 and 18.11
  • reducing the number of TX/RX descriptors from 32768 to 16384

but with no success.

You can see my port_init function here (for DPDK 18.11).
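
Schematically, the queue setup part of it does something like this (a simplified sketch, not the exact code behind the link; nb_desc and pools[] are placeholders, and error handling is shortened):

    #include <stdint.h>
    #include <rte_ethdev.h>
    #include <rte_mempool.h>

    /* Configure N RX/TX queue pairs on port_id; pools[] holds one mbuf
     * pool per RX queue. nb_desc was 32768 in the failing setup. */
    static int
    port_init(uint16_t port_id, uint16_t n, uint16_t nb_desc,
              struct rte_mempool *pools[])
    {
        struct rte_eth_conf port_conf = { 0 };
        uint16_t q;
        int ret;

        ret = rte_eth_dev_configure(port_id, n, n, &port_conf);
        if (ret != 0)
            return ret;

        for (q = 0; q < n; q++) {
            ret = rte_eth_rx_queue_setup(port_id, q, nb_desc,
                                         rte_eth_dev_socket_id(port_id),
                                         NULL, pools[q]);
            if (ret != 0)
                return ret;

            ret = rte_eth_tx_queue_setup(port_id, q, nb_desc,
                                         rte_eth_dev_socket_id(port_id),
                                         NULL);
            if (ret != 0)
                return ret;
        }

        /* With n == 8 and nb_desc == 32768, this is the call that fails
         * with -12 (ENOMEM). */
        return rte_eth_dev_start(port_id);
    }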

Thanks for your help!


Solution

  • The issue is related to the TX inlining feature of the MLX5 driver, which is only enabled when the number of queues is >= 8. With TX inlining, packet data is copied directly into the transmit descriptor (WQE) instead of being fetched from the mbuf by DMA, which makes each descriptor larger.

    Because of this, some checks in the underlying verbs library (which DPDK calls during QP creation) fail when a large number of descriptors is used, so a workaround is to use fewer descriptors.

    I was using 32768 descriptors, since the advertised value in dev_info.rx_desc_lim.nb_max is higher.

    The issue is solved by using 1024 descriptors per queue, roughly as sketched below.
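
    In terms of a port_init like the one sketched in the question, the fix boils down to a smaller ring size; the limits advertised in dev_info are not a safe guide here (illustrative sketch, the helper name is made up):

        #include <stdint.h>
        #include <rte_ethdev.h>
        #include <rte_mempool.h>

        /* Same port_init as sketched in the question, just called with a
         * much smaller ring size. 1024 is the value that worked for me,
         * not a documented limit of the mlx5 PMD. */
        static int
        start_port_workaround(uint16_t port_id, struct rte_mempool *pools[])
        {
            struct rte_eth_dev_info dev_info;

            rte_eth_dev_info_get(port_id, &dev_info);
            /* rx_desc_lim.nb_max / tx_desc_lim.nb_max advertise far more
             * descriptors than the PMD accepts for QP creation once TX
             * inlining is enabled, so they are not a safe upper bound. */

            return port_init(port_id, 8 /* queues */, 1024 /* descriptors */,
                             pools);
        }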