Tags: linux, openstack, dpdk, nfv

DPDK for general purpose workload


I have deployed OpenStack and configured OVS-DPDK on the compute nodes for high-performance networking. My workload is general purpose: HAProxy, MySQL, Apache, XMPP, etc.

When I did load testing, I found performance was average, and beyond a 200 kpps packet rate I noticed packet drops. I have heard and read that DPDK can handle millions of packets per second, but in my case that doesn't seem to hold. In the guest I am using virtio-net, which processes packets in the kernel, so I believe my bottleneck is the guest VM.

I don't have any guest-based DPDK application like testpmd. Does that mean OVS+DPDK isn't useful for my cloud? How do I take advantage of OVS+DPDK with a general-purpose workload?
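
For reference, this is how virtio-net multiqueue can be checked and enabled inside the guest before concluding that virtio-net itself is the limit; a minimal sketch, where eth0 and the queue count of 8 are only example values:

# Show how many combined rx/tx queues virtio-net currently exposes
ethtool -l eth0

# Enable more queues, e.g. one per vCPU (8 is only an example)
ethtool -L eth0 combined 8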

Updates

We have our own load-testing tool which generates audio RTP traffic: pure UDP, 150-byte packets. Beyond 200 kpps the audio quality goes down and becomes choppy. In short, the DPDK host hits high PMD CPU usage and the load test shows bad audio quality. When I run the same test with an SR-IOV based VM, performance is really good.

$ ovs-vswitchd -V
ovs-vswitchd (Open vSwitch) 2.13.3
DPDK 19.11.7

Intel NIC X550T

# ethtool -i ext0
driver: ixgbe
version: 5.1.0-k
firmware-version: 0x80000d63, 18.8.9
expansion-rom-version:
bus-info: 0000:3b:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

In the following output, what do queue-id: 0 to 7 mean, and why is only the first queue in use while the others always show 0%? What does this mean?

ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 0 core_id 2:
  isolated : false
  port: vhu1c3bf17a-01    queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhu1c3bf17a-01    queue-id:  1 (enabled)   pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  2 (disabled)  pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  3 (disabled)  pmd usage:  0 %
pmd thread numa_id 1 core_id 3:
  isolated : false
pmd thread numa_id 0 core_id 22:
  isolated : false
  port: vhu1c3bf17a-01    queue-id:  3 (enabled)   pmd usage:  0 %
  port: vhu1c3bf17a-01    queue-id:  6 (enabled)   pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  0 (enabled)   pmd usage: 54 %
  port: vhu6b7daba9-1a    queue-id:  5 (disabled)  pmd usage:  0 %
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  3 %
pmd thread numa_id 0 core_id 26:
  isolated : false
  port: vhu1c3bf17a-01    queue-id:  2 (enabled)   pmd usage:  0 %
  port: vhu1c3bf17a-01    queue-id:  7 (enabled)   pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  1 (disabled)  pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  4 (disabled)  pmd usage:  0 %
pmd thread numa_id 1 core_id 27:
  isolated : false
pmd thread numa_id 0 core_id 46:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage:  27 %
  port: vhu1c3bf17a-01    queue-id:  4 (enabled)   pmd usage:  0 %
  port: vhu1c3bf17a-01    queue-id:  5 (enabled)   pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  6 (disabled)  pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  7 (disabled)  pmd usage:  0 %
pmd thread numa_id 1 core_id 47:
  isolated : false
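
For reference, the "(disabled)" entries on the vhu ports are rx queues the guest virtio driver has not enabled, and each enabled queue is polled by exactly one PMD thread, so a single hot queue can peg one PMD while the rest sit at 0%. A sketch of the knobs that control how rx queues are spread over PMD threads (the values below are examples, not recommendations):

# Give the physical DPDK ports more rx queues (2 is only an example)
ovs-vsctl set Interface dpdk0 options:n_rxq=2
ovs-vsctl set Interface dpdk1 options:n_rxq=2

# Assign rx queues to PMDs based on measured load rather than round-robin
ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign=cycles

# Trigger an immediate reassignment of rx queues to PMD threads
ovs-appctl dpif-netdev/pmd-rxq-rebalance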


$ ovs-appctl dpif-netdev/pmd-stats-clear && sleep 10 && \
    ovs-appctl dpif-netdev/pmd-stats-show | grep "processing cycles:"
  processing cycles: 1697952 (0.01%)
  processing cycles: 12726856558 (74.96%)
  processing cycles: 4259431602 (19.40%)
  processing cycles: 512666 (0.00%)
  processing cycles: 6324848608 (37.81%)

Does "processing cycles" mean my PMD is under stress? But I am only hitting a 200 kpps rate.
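
Note that these percentages are per PMD thread, so one PMD can be near saturation even though the aggregate rate is only ~200 kpps; the 74.96% line above is one busy PMD, not the average. A sketch of the commands that show the fuller per-PMD breakdown on OVS 2.13:

# Idle vs processing cycles and packet counters per PMD thread
ovs-appctl dpif-netdev/pmd-stats-show

# More detailed per-PMD performance metrics, if supported by this build
ovs-appctl dpif-netdev/pmd-perf-show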

These are my dpdk0 and dpdk1 port statistics:

sudo ovs-vsctl get Interface dpdk0 statistics
{flow_director_filter_add_errors=153605,
flow_director_filter_remove_errors=30829, mac_local_errors=0,
mac_remote_errors=0, ovs_rx_qos_drops=0, ovs_tx_failure_drops=0,
ovs_tx_invalid_hwol_drops=0, ovs_tx_mtu_exceeded_drops=0,
ovs_tx_qos_drops=0, rx_128_to_255_packets=64338613,
rx_1_to_64_packets=367, rx_256_to_511_packets=116298,
rx_512_to_1023_packets=31264, rx_65_to_127_packets=6990079,
rx_broadcast_packets=0, rx_bytes=12124930385, rx_crc_errors=0,
rx_dropped=0, rx_errors=12, rx_fcoe_crc_errors=0, rx_fcoe_dropped=12,
rx_fcoe_mbuf_allocation_errors=0, rx_fragment_errors=367,
rx_illegal_byte_errors=0, rx_jabber_errors=0, rx_length_errors=0,
rx_mac_short_packet_dropped=128, rx_management_dropped=35741,
rx_management_packets=31264, rx_mbuf_allocation_errors=0,
rx_missed_errors=0, rx_oversize_errors=0, rx_packets=71512362,
rx_priority0_dropped=0, rx_priority0_mbuf_allocation_errors=1096,
rx_priority1_dropped=0, rx_priority1_mbuf_allocation_errors=0,
rx_priority2_dropped=0, rx_priority2_mbuf_allocation_errors=0,
rx_priority3_dropped=0, rx_priority3_mbuf_allocation_errors=0,
rx_priority4_dropped=0, rx_priority4_mbuf_allocation_errors=0,
rx_priority5_dropped=0, rx_priority5_mbuf_allocation_errors=0,
rx_priority6_dropped=0, rx_priority6_mbuf_allocation_errors=0,
rx_priority7_dropped=0, rx_priority7_mbuf_allocation_errors=0,
rx_undersize_errors=6990079, tx_128_to_255_packets=64273778,
tx_1_to_64_packets=128, tx_256_to_511_packets=43670294,
tx_512_to_1023_packets=153605, tx_65_to_127_packets=881272,
tx_broadcast_packets=10, tx_bytes=25935295292, tx_dropped=0,
tx_errors=0, tx_management_packets=0, tx_multicast_packets=153,
tx_packets=109009906}


sudo ovs-vsctl get Interface dpdk1 statistics
{flow_director_filter_add_errors=126793,
flow_director_filter_remove_errors=37969, mac_local_errors=0,
mac_remote_errors=0, ovs_rx_qos_drops=0, ovs_tx_failure_drops=0,
ovs_tx_invalid_hwol_drops=0, ovs_tx_mtu_exceeded_drops=0,
ovs_tx_qos_drops=0, rx_128_to_255_packets=64435459,
rx_1_to_64_packets=107843, rx_256_to_511_packets=230,
rx_512_to_1023_packets=13, rx_65_to_127_packets=7049788,
rx_broadcast_packets=199058, rx_bytes=12024342488, rx_crc_errors=0,
rx_dropped=0, rx_errors=11, rx_fcoe_crc_errors=0, rx_fcoe_dropped=11,
rx_fcoe_mbuf_allocation_errors=0, rx_fragment_errors=107843,
rx_illegal_byte_errors=0, rx_jabber_errors=0, rx_length_errors=0,
rx_mac_short_packet_dropped=1906, rx_management_dropped=0,
rx_management_packets=13, rx_mbuf_allocation_errors=0,
rx_missed_errors=0, rx_oversize_errors=0, rx_packets=71593333,
rx_priority0_dropped=0, rx_priority0_mbuf_allocation_errors=1131,
rx_priority1_dropped=0, rx_priority1_mbuf_allocation_errors=0,
rx_priority2_dropped=0, rx_priority2_mbuf_allocation_errors=0,
rx_priority3_dropped=0, rx_priority3_mbuf_allocation_errors=0,
rx_priority4_dropped=0, rx_priority4_mbuf_allocation_errors=0,
rx_priority5_dropped=0, rx_priority5_mbuf_allocation_errors=0,
rx_priority6_dropped=0, rx_priority6_mbuf_allocation_errors=0,
rx_priority7_dropped=0, rx_priority7_mbuf_allocation_errors=0,
rx_undersize_errors=7049788, tx_128_to_255_packets=102664472,
tx_1_to_64_packets=1906, tx_256_to_511_packets=68008814,
tx_512_to_1023_packets=126793, tx_65_to_127_packets=1412435,
tx_broadcast_packets=1464, tx_bytes=40693963125, tx_dropped=0,
tx_errors=0, tx_management_packets=199058, tx_multicast_packets=146,
tx_packets=172252389}
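
If needed, the drop-related counters from the dumps above can be sampled before and after a load-test run to see whether the NIC or OVS itself is discarding packets; a small sketch:

# Sample drop counters on both physical DPDK ports
for p in dpdk0 dpdk1; do
  echo "== $p =="
  ovs-vsctl get Interface "$p" statistics:rx_dropped
  ovs-vsctl get Interface "$p" statistics:rx_missed_errors
  ovs-vsctl get Interface "$p" statistics:ovs_tx_failure_drops
done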

Update - 2

DPDK interfaces

  # dpdk-devbind.py -s
    
    Network devices using DPDK-compatible driver
    ============================================
    0000:3b:00.1 'Ethernet Controller 10G X550T 1563' drv=vfio-pci unused=ixgbe
    0000:af:00.1 'Ethernet Controller 10G X550T 1563' drv=vfio-pci unused=ixgbe
    
    Network devices using kernel driver
    ===================================
    0000:04:00.0 'NetXtreme BCM5720 2-port Gigabit Ethernet PCIe 165f' if=eno1 drv=tg3 unused=vfio-pci
    0000:04:00.1 'NetXtreme BCM5720 2-port Gigabit Ethernet PCIe 165f' if=eno2 drv=tg3 unused=vfio-pci
    0000:3b:00.0 'Ethernet Controller 10G X550T 1563' if=int0 drv=ixgbe unused=vfio-pci
    0000:af:00.0 'Ethernet Controller 10G X550T 1563' if=int1 drv=ixgbe unused=vfio-pci

OVS

# ovs-vsctl show
595103ef-55a1-4f71-b299-a14942965e75
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-tun
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port br-tun
            Interface br-tun
                type: internal
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port vxlan-0a48042b
            Interface vxlan-0a48042b
                type: vxlan
                options: {df_default="true", egress_pkt_mark="0", in_key=flow, local_ip="10.72.4.44", out_key=flow, remote_ip="10.72.4.43"}
        Port vxlan-0a480429
            Interface vxlan-0a480429
                type: vxlan
                options: {df_default="true", egress_pkt_mark="0", in_key=flow, local_ip="10.72.4.44", out_key=flow, remote_ip="10.72.4.41"}
        Port vxlan-0a48041f
            Interface vxlan-0a48041f
                type: vxlan
                options: {df_default="true", egress_pkt_mark="0", in_key=flow, local_ip="10.72.4.44", out_key=flow, remote_ip="10.72.4.31"}
        Port vxlan-0a48042a
            Interface vxlan-0a48042a
                type: vxlan
                options: {df_default="true", egress_pkt_mark="0", in_key=flow, local_ip="10.72.4.44", out_key=flow, remote_ip="10.72.4.42"}
    Bridge br-vlan
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port br-vlan
            Interface br-vlan
                type: internal
        Port dpdkbond
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:af:00.1", n_txq_desc="2048"}
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:3b:00.1", n_txq_desc="2048"}
        Port phy-br-vlan
            Interface phy-br-vlan
                type: patch
                options: {peer=int-br-vlan}
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port vhu87cf49d2-5b
            tag: 7
            Interface vhu87cf49d2-5b
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhu87cf49d2-5b"}
        Port vhub607c1fa-ec
            tag: 7
            Interface vhub607c1fa-ec
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhub607c1fa-ec"}
        Port vhu9a035444-83
            tag: 8
            Interface vhu9a035444-83
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhu9a035444-83"}
        Port br-int
            Interface br-int
                type: internal
        Port int-br-vlan
            Interface int-br-vlan
                type: patch
                options: {peer=phy-br-vlan}
        Port vhue00471df-d8
            tag: 8
            Interface vhue00471df-d8
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhue00471df-d8"}
        Port vhu683fdd35-91
            tag: 7
            Interface vhu683fdd35-91
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhu683fdd35-91"}
        Port vhuf04fb2ec-ec
            tag: 8
            Interface vhuf04fb2ec-ec
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhuf04fb2ec-ec"}
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
    ovs_version: "2.13.3"
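
For completeness, the DPDK-related settings OVS was started with can be dumped like this (which keys appear depends on how the deployment configured them):

# Confirm DPDK is initialized and inspect socket-mem / CPU-mask settings
ovs-vsctl get Open_vSwitch . dpdk_initialized
ovs-vsctl get Open_vSwitch . other_config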

I have created the guest VMs using OpenStack, and they are connected using vhost-user sockets (e.g. /var/lib/vhost_socket/vhuf04fb2ec-ec).
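
Since vhost-user ports require hugepage-backed guest memory, it may also be worth confirming the flavor the guests were booted with; a sketch, where m1.dpdk is a hypothetical flavor name:

# Check the flavor's extra specs
openstack flavor show m1.dpdk -c properties

# Typical property for hugepage-backed guest memory
openstack flavor set m1.dpdk --property hw:mem_page_size=large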


Solution

  • When I did load testing, I found performance was average and beyond a 200 kpps packet rate I noticed packet drops. In short, the DPDK host hit high PMD CPU usage and the load test showed bad audio quality. When I do the same test with an SR-IOV based VM, performance is really good.

    [Answer] This observation does not hold up, based on the live debugging done so far, for the reasons stated below:

    1. The QEMU instances launched were not pinned to specific cores.
    2. Comparing PCIe pass-through (SR-IOV VF) against vhost-user client ports is not an apples-to-apples comparison.
    3. With the OpenStack approach, packets flow through at least 3 bridges before reaching the VM.
    4. The OVS PMD threads were not pinned, which led to all of them running on the same core (causing latency and drops) at each bridge stage; see the pinning sketch after this list.
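
    For points 1 and 4 specifically, here is a minimal sketch of what the pinning can look like; the CPU masks and the flavor name m1.dpdk are purely illustrative and must be adapted to the host topology:

      # Pin OVS PMD threads to dedicated host cores (mask is an example)
      ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x440044

      # Keep non-PMD OVS/DPDK housekeeping threads on a separate core
      ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x2

      # Pin guest vCPUs and isolate emulator threads via the Nova flavor
      openstack flavor set m1.dpdk \
        --property hw:cpu_policy=dedicated \
        --property hw:emulator_threads_policy=isolate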

    To have a fair comparison against the SR-IOV approach, the following changes were made and the test topology simplified to:

      External Port <==> DPDK Port0 (L2fwd) <==> DPDK net_vhost <--> QEMU (virtio-pci)
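
    A roughly equivalent path can be brought up with the stock DPDK 19.11 testpmd in io-forward mode instead of the l2fwd example named above; a minimal sketch, assuming the physical port is already bound to vfio-pci and /tmp/sock0 is the vhost-user socket QEMU connects to (both are assumptions, not values from this setup):

      # Bridge the physical DPDK port and a vhost-user vdev in io mode;
      # core list, socket path and queue count are illustrative only
      testpmd -l 0-2 -n 4 \
        --vdev 'net_vhost0,iface=/tmp/sock0,queues=1' \
        -- -i --forward-mode=io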
    

    The numbers achieved with iperf3 (bidirectional) are around 10 Gbps.

    Note: it was requested to run TRex or pktgen to measure Mpps. The expectation is to reach at least 8 Mpps with the current setup.

    Hence this is not a DPDK, vhost/virtio, QEMU-KVM, or SR-IOV related issue; instead, it is a configuration or platform setup issue.