Tags: overlay, kvm, dpdk, openvswitch, virtual-network

Connect QEMU-KVM VMs using vhost-user-client and ovs-dpdk


My goal is to connect two QEMU-KVM VMs over an overlay network. Each VM runs on a separate physical host and must have a static IP on the 10.0.0.0/24 network. To do this, I want to use an OVS bridge with DPDK on each host and connect the bridge to the VM through a vhost-user-client port.

My physical setup is as follows: two physical machines, each equipped with a Mellanox ConnectX-6 Dx NIC, connected back-to-back (no physical switch). This is what I want to achieve:

+------------------+            +------------------+
| HOST_1           |            | HOST_2           |
|                  |            |                  |
|  +------------+  |            |  +------------+  |
|  | VM_1       |  |            |  | VM_2       |  |
|  |            |  |            |  |            |  |
|  | +--------+ |  |            |  | +--------+ |  |
|  | | ens_2  | |  |            |  | | ens_2  | |  |
|  | |10.0.0.1| |  |            |  | |10.0.0.2| |  |
|  +-+---+----+-+  |            |  +-+---+----+-+  |
|        |         |            |        |         | 
|  vhost-client-1  |            |  vhost-client-1  |
|        |         |            |        |         |
|  +-----+------+  |            |  +-----+------+  |
|  |  bridge    |  |            |  |  bridge    |  |
|  |    br0     |  |            |  |    br0     |  |
|  |192.168.57.1|  |            |  |192.168.57.2|  |
|  +-----+------+  |            |  +-----+------+  |
|        |         |            |        |         |
|    +---+---+     |            |    +---+---+     |
|    | dpdk0 |     |            |    | dpdk0 |     |
+----+---+---+-----+            +----+---+---+-----+
         |                               |
         +-------------------------------+

I successfully created the OVS bridge (br0) and the DPDK port (dpdk0) on each host, and from each physical machine I can ping the bridge on the other one. I then created a vhost-user-client port and attached it to the bridge. On each guest, I assigned a static IP as shown in the diagram above, and the ens2 interface is up.
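For readers reproducing the setup, the steps in the previous paragraph roughly correspond to the ovs-vsctl calls below. This is a sketch, not the exact commands used in the question: the PCI address and socket path are taken from the ovs-vsctl show output in this post, the `run` wrapper and the `APPLY` variable are hypothetical helpers, and it assumes an OVS build with DPDK support where dpdk-init is already enabled. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the per-host OVS-DPDK setup (bridge, DPDK port, vhost-user-client port).
# Values below come from the `ovs-vsctl show` output in this post.
PCI_ADDR="0000:01:00.0"
SOCK="/tmp/vhost-client-1"

# Hypothetical helper: print the command, or execute it when APPLY=1 (needs root).
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

setup_bridge() {
    # Userspace (netdev) datapath is required for DPDK ports
    run ovs-vsctl --may-exist add-br br0 -- set bridge br0 datapath_type=netdev
    # Physical NIC handled by DPDK
    run ovs-vsctl --may-exist add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk "options:dpdk-devargs=$PCI_ADDR"
    # OVS acts as the vhost-user client; QEMU creates the socket (mode='server')
    run ovs-vsctl --may-exist add-port br0 vhost-client-1 -- set Interface vhost-client-1 type=dpdkvhostuserclient "options:vhost-server-path=$SOCK"
}

setup_bridge
```

Running the script as is prints the three commands for review; `APPLY=1 sh setup.sh` (the file name is arbitrary) would execute them.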

However, I cannot ping VM_2 from VM_1 or vice versa. No traffic seems to pass through the vhost-client-1 port at all, and ping fails with a Destination Host Unreachable message.

Some useful information:

ovs-vsctl show

    Bridge br0
        datapath_type: netdev
        Port br0
            Interface br0
                type: internal
        Port dpdk0
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:01:00.0"}
        Port vhost-client-1
            Interface vhost-client-1
                type: dpdkvhostuserclient
                options: {vhost-server-path="/tmp/vhost-client-1"}
    ovs_version: "2.16.1"

ovs-vsctl -- --columns=name,ofport list Interface

name                : br0
ofport              : 65534

name                : dpdk0
ofport              : 6

name                : vhost-client-1
ofport              : 2

ovs-ofctl dump-flows br0

cookie=0x0, duration=104.689s, table=0, n_packets=0, n_bytes=0, in_port="vhost-client-1" actions=output:dpdk0
cookie=0x0, duration=99.573s, table=0, n_packets=4, n_bytes=924, in_port=dpdk0 actions=output:"vhost-client-1"

ovs-ofctl show br0

OFPT_FEATURES_REPLY (xid=0x2): dpid:0000b8cef64def2e
n_tables:254, n_buffers:0
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst
 2(vhost-client-1): addr:00:00:00:00:00:00
     config:     0
     state:      LINK_DOWN
     speed: 0 Mbps now, 0 Mbps max
 6(dpdk0): addr:b8:ce:f6:4d:ef:2e
     config:     0
     state:      0
     current:    AUTO_NEG
     speed: 0 Mbps now, 0 Mbps max
 LOCAL(br0): addr:b8:ce:f6:4d:ef:2e
     config:     0
     state:      0
     current:    10MB-FD COPPER
     speed: 10 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0

Libvirt XML configuration (relevant parts)

<domain type='kvm'>
  <name>ubuntu-server</name>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <interface type='vhostuser'>
      <mac address='52:54:00:16:a5:76'/>
      <source type='unix' path='/tmp/vhost-client-1' mode='server'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
  </devices>
</domain>

Which configuration option am I missing? I have followed several guides, but I am still unable to get any traffic between my VMs.

I suspect the problem is related to the LINK_DOWN state of the vhost-client-1 port reported by ovs-ofctl show. I tried to bring the port up with ovs-ofctl mod-port br0 vhost-client-1 up. The command did not fail, but nothing changed.
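A quick way to keep an eye on this symptom while changing the VM configuration is to filter the ovs-ofctl show output for ports in the LINK_DOWN state. The helper below is only a sketch: `link_down_ports` is a hypothetical name, and the sample input is copied from the output earlier in this post so the snippet runs without OVS installed.

```shell
#!/bin/sh
# Sketch: list the ports that `ovs-ofctl show` reports as LINK_DOWN.
# link_down_ports is a hypothetical helper name, not part of OVS.
link_down_ports() {
    awk '
        # Port header lines look like " 2(vhost-client-1): addr:00:..."
        /^ *[0-9A-Za-z]+\(.*\): addr:/ {
            name = $0
            sub(/^[^(]*\(/, "", name)   # drop everything up to and including "("
            sub(/\).*$/, "", name)      # drop ")" and everything after it
        }
        # A state line belongs to the most recent port header
        /state:/ && /LINK_DOWN/ { print name }
    '
}

# Sample input copied from the output above; on a real host use:
#   ovs-ofctl show br0 | link_down_ports
link_down_ports <<'EOF'
 2(vhost-client-1): addr:00:00:00:00:00:00
     config:     0
     state:      LINK_DOWN
 6(dpdk0): addr:b8:ce:f6:4d:ef:2e
     config:     0
     state:      0
EOF
```

The same information should also be available per interface with `ovs-vsctl get Interface vhost-client-1 link_state`.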

Any thoughts?


Solution

  • I eventually solved the problem. Vipin's answer was useful but did not resolve the issue. The configuration I was missing was the numa element inside the cpu element.

    I am posting the working configuration in case it is useful to others. The first part is the memory backing (under the domain element):

      <memory unit='KiB'>[VM memory size]</memory>
      <currentMemory unit='KiB'>[VM memory size]</currentMemory>
      <memoryBacking>
        <hugepages>
          <page size='2048' unit='KiB'/>
        </hugepages>
        <locked/>
        <source type='file'/>
        <access mode='shared'/>
        <allocation mode='immediate'/>
        <discard/>
      </memoryBacking>
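The hugepages element above only works if the host actually has free 2048 KiB hugepages for the guest. The sketch below shows one way to check that before starting the VM; `free_hugepage_kib` is a hypothetical helper that reads /proc/meminfo-style text, and the assumption is a Linux host whose default hugepage size is the one used by the VM.

```shell
#!/bin/sh
# Sketch: report how much hugepage memory is still free on the host, in KiB.
# free_hugepage_kib is a hypothetical helper; it reads /proc/meminfo-style text on stdin.
free_hugepage_kib() {
    awk '
        $1 == "HugePages_Free:" { free = $2 }   # number of free pages
        $1 == "Hugepagesize:"   { size = $2 }   # default page size, reported in kB
        END { print free * size }
    '
}

# Compare the result with the [VM memory size] configured for the guest
if [ -r /proc/meminfo ]; then
    free_hugepage_kib < /proc/meminfo
fi
```

If the printed value is smaller than the guest memory size, the VM cannot be fully backed by hugepages with this configuration.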
    

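    We also needed the numa configuration, even though our machine has just one processor. vhost-user requires the guest memory to be mapped as shared so that the OVS process can access it, and the memAccess='shared' attribute on the NUMA cell requests exactly that: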

    <cpu mode='custom' match='exact' check='full'>
        <model fallback='forbid'>qemu64</model>
        <feature policy='require' name='x2apic'/>
        <feature policy='require' name='hypervisor'/>
        <feature policy='require' name='lahf_lm'/>
        <feature policy='disable' name='svm'/>
        <numa>
          <cell id='0' cpus='0-1' memory='[VM memory size]' unit='KiB' memAccess='shared'/>
        </numa>
    </cpu>
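To confirm that a domain definition really contains the shared NUMA cell, a simple text check on the domain XML is often enough. This is a sketch under stated assumptions: `has_shared_numa` is a hypothetical helper, it uses a plain grep rather than a real XML parser, and the domain name ubuntu-server is the one from the question.

```shell
#!/bin/sh
# Sketch: succeed if the domain XML on stdin has a NUMA cell with memAccess='shared'.
# has_shared_numa is a hypothetical helper; it is a text match, not an XML parser.
has_shared_numa() {
    grep -Eq "<cell[^>]*memAccess=['\"]shared['\"]"
}

# Self-contained example using the cell line from the configuration above;
# on a real host use: virsh dumpxml ubuntu-server | has_shared_numa
if printf '%s\n' "<cell id='0' cpus='0-1' memory='[VM memory size]' unit='KiB' memAccess='shared'/>" | has_shared_numa; then
    echo "shared NUMA cell found"
else
    echo "no shared NUMA cell"
fi
```

After restarting the VM with this configuration, the vhost-client-1 port should no longer show up as LINK_DOWN in ovs-ofctl show br0.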