Test Setup: Linux-Server-1 Port-A <==> Port 1 DPDK-Server-2 Port 2 <==> Port B Linux-Server-2
Steps Followed:
Network devices using DPDK-compatible driver
============================================
0000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' drv=uio_pci_generic unused=ixgbe,vfio-pci
0000:03:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' drv=uio_pci_generic unused=ixgbe,vfio-pci
Network devices using kernel driver
===================================
0000:05:00.0 'I210 Gigabit Network Connection 1533' if=enp5s0 drv=igb unused=vfio-pci,uio_pci_generic *Active*
0000:06:00.0 'I210 Gigabit Network Connection 1533' if=enp6s0 drv=igb unused=vfio-pci,uio_pci_generic
Issue: Port 2 of the DPDK server is reported DOWN by app_ports_check_link.
[EDIT] Running a DPDK example, I am able to get packets sent to DPDK port 1 and port 2.
Log for eventdev:
EAL: PCI device 0000:03:00.0 on NUMA socket 0
EAL: probe driver: 8086:10fb net_ixgbe
EAL: PCI device 0000:03:00.1 on NUMA socket 0
EAL: probe driver: 8086:10fb net_ixgbe
EAL: PCI device 0000:05:00.0 on NUMA socket 0
EAL: probe driver: 8086:1533 net_e1000_igb
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL: probe driver: 8086:1533 net_e1000_igb
USER1: Creating the mbuf pool ...
USER1: Initializing NIC port 0 ...
USER1: Initializing NIC port 1 ...
USER1: Port 0 (10 Gbps) UP
USER1: Port 1 (0 Gbps) DOWN
PANIC in app_ports_check_link():
Some NIC ports are DOWN
8: [./build/pipeline(_start+0x2a) [0x558dc37c1d8a]]
7: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7f318e9f5b97]]
6: [./build/pipeline(main+0x7a) [0x558dc37c1fa4]]
5: [./build/pipeline(_Z8app_initv+0x18) [0x558dc37c2940]]
4: [./build/pipeline(+0x8c909) [0x558dc37c2909]]
3: [./build/pipeline(+0x8c677) [0x558dc37c2677]]
2: [./build/pipeline(__rte_panic+0xc5) [0x558dc37b4a90]]
1: [./build/pipeline(rte_dump_stack+0x2e) [0x558dc385954e]]
fish: “sudo ./build/pipeline” terminated by signal SIGABRT (Abort)
Code
static void
app_ports_check_link(void)
{
    uint32_t all_ports_up, i;

    all_ports_up = 1;

    for (i = 0; i < app.n_ports; i++) {
        struct rte_eth_link link;
        uint16_t port;

        port = app.ports[i];
        memset(&link, 0, sizeof(link));
        rte_eth_link_get_nowait(port, &link);
        RTE_LOG(INFO, USER1, "Port %u (%u Gbps) %s\n",
            port,
            link.link_speed / 1000,
            link.link_status ? "UP" : "DOWN");

        if (link.link_status == ETH_LINK_DOWN)
            all_ports_up = 0;
    }

    if (all_ports_up == 0)
        rte_panic("Some NIC ports are DOWN\n");
}
static void
app_init_ports(void)
{
    uint32_t i;
    struct rte_eth_conf port_conf = app_port_conf_init();
    struct rte_eth_rxconf rx_conf = app_rx_conf_init();
    struct rte_eth_txconf tx_conf = app_tx_conf_init();
    (void)tx_conf;

    /* Init NIC ports, then start the ports */
    for (i = 0; i < app.n_ports; i++) {
        uint16_t port;
        int ret;

        port = app.ports[i];
        RTE_LOG(INFO, USER1, "Initializing NIC port %u ...\n", port);

        /* Init port */
        ret = rte_eth_dev_configure(port, 1, 1, &port_conf);
        if (ret < 0)
            rte_panic("Cannot init NIC port %u (%s)\n",
                port, rte_strerror(ret));
        rte_eth_promiscuous_enable(port);

        /* Init RX queues */
        ret = rte_eth_rx_queue_setup(
            port,
            0,
            app.port_rx_ring_size,
            rte_eth_dev_socket_id(port),
            &rx_conf,
            app.pool);
        if (ret < 0)
            rte_panic("Cannot init RX for port %u (%d)\n",
                (uint32_t) port, ret);

        /* Init TX queues */
        ret = rte_eth_tx_queue_setup(
            port,
            0,
            app.port_tx_ring_size,
            rte_eth_dev_socket_id(port),
            NULL);
        if (ret < 0)
            rte_panic("Cannot init TX for port %u (%d)\n",
                (uint32_t) port, ret);

        /* Start port */
        ret = rte_eth_dev_start(port);
        if (ret < 0)
            rte_panic("Cannot start port %u (%d)\n", port, ret);
    }

    app_ports_check_link();
}
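For comparison, the DPDK sample applications (for example l2fwd's check_all_ports_link_status()) do not panic on the first DOWN reading: they poll the link for a bounded time so auto-negotiation has a chance to finish after rte_eth_dev_start(). A minimal sketch of that pattern; app_wait_for_link and the two constants are illustrative names, not part of the original code:

#include <string.h>

#include <rte_cycles.h>
#include <rte_ethdev.h>

#define APP_LINK_POLL_MS    100   /* poll interval */
#define APP_LINK_POLL_MAX    90   /* give up after roughly 9 s */

static int
app_wait_for_link(uint16_t port)
{
    struct rte_eth_link link;
    int i;

    for (i = 0; i < APP_LINK_POLL_MAX; i++) {
        memset(&link, 0, sizeof(link));
        rte_eth_link_get_nowait(port, &link);
        if (link.link_status == ETH_LINK_UP)
            return 0;                     /* link came up */
        rte_delay_ms(APP_LINK_POLL_MS);   /* give autoneg time to finish */
    }
    return -1;                            /* still down after the timeout */
}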
[EDIT] 2020/7/1 Update
Running $RTE_SDK/examples/skeleton/build/basicfwd -l 1, I got the following:
EAL: Detected 24 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: PCI device 0000:03:00.0 on NUMA socket 0
EAL: probe driver: 8086:10fb net_ixgbe
EAL: PCI device 0000:03:00.1 on NUMA socket 0
EAL: probe driver: 8086:10fb net_ixgbe
EAL: PCI device 0000:05:00.0 on NUMA socket 0
EAL: probe driver: 8086:1533 net_e1000_igb
EAL: PCI device 0000:06:00.0 on NUMA socket 0
EAL: probe driver: 8086:1533 net_e1000_igb
Port 0 MAC: 9c 69 b4 60 90 26
Port 1 MAC: 9c 69 b4 60 90 27
Core 1 forwarding packets. [Ctrl+C to quit]
recv pkts num: 1, port: 0
================= Ether header ===============
srcmac: 9C:69:B4:60:90:17
dstmac: 33:33:00:00:00:16
ethertype: 34525
This packet is IPv6
================= Ether header ===============
srcmac: 9C:69:B4:60:90:17
dstmac: 33:33:00:00:00:16
ethertype: 34525
This packet is IPv6
send 1 pkts, port: 1
recv pkts num: 1, port: 1
================= Ether header ===============
srcmac: 9C:69:B4:60:90:1C
dstmac: 33:33:00:00:00:16
ethertype: 34525
This packet is IPv6
================= Ether header ===============
srcmac: 9C:69:B4:60:90:1C
dstmac: 33:33:00:00:00:16
ethertype: 34525
This packet is IPv6
send 1 pkts, port: 0
recv pkts num: 1, port: 1
================= Ether header ===============
srcmac: 9C:69:B4:60:90:1C
dstmac: 33:33:00:00:00:16
ethertype: 34525
This packet is IPv6
================= Ether header ===============
srcmac: 9C:69:B4:60:90:1C
dstmac: 33:33:00:00:00:16
ethertype: 34525
This packet is IPv6
send 1 pkts, port: 0
...
It seems that there is no problem with the two ports. Strange!
[EDIT] 2020/7/2 Update
After replacing rte_eth_link_get_nowait with rte_eth_link_get, the program works normally.
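For clarity, the change inside app_ports_check_link() amounts to the following (sketch of the relevant lines only):

        memset(&link, 0, sizeof(link));
        /* Was: rte_eth_link_get_nowait(port, &link);
         * rte_eth_link_get() lets the driver wait until the link readout
         * has settled, so a port that is still auto-negotiating is not
         * reported DOWN prematurely. */
        rte_eth_link_get(port, &link);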
Following @Vipin Varghese's suggestion, I have checked the ports' settings with ethtool DEVNAME and ethtool -a DEVNAME:
Settings for ens1f1:
Supported ports: [ FIBRE ]
Supported link modes: 10000baseT/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes: 10000baseT/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 10000Mb/s
Duplex: Full
Port: FIBRE
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Supports Wake-on: d
Wake-on: d
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
Settings for ens1f0:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseT/Full
10000baseT/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 1000baseT/Full
10000baseT/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Speed: 10000Mb/s
Duplex: Full
Port: FIBRE
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: d
Wake-on: d
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
Autonegotiate: off
RX: on
TX: on
But I'm really confused: what is the difference between rte_eth_link_get_nowait and rte_eth_link_get (per the DPDK doc)? Why can autoneg make them behave differently?

Explanation:
Checking with ethtool while the application is down is not a trusted way. Depending upon the DPDK version, rte_eth_dev_close or rte_eal_cleanup would not have put the NIC back into the right state.

a. The Server-3 port might be auto-negotiating with DPDK port-1, leading rte_eth_link_get_nowait to report the link as down (the right API to invoke is rte_eth_link_get).
b. The Server-3 port might be manually configured in a non-full-duplex, non-10G mode.

The right way to debug is to force no auto-neg, 10G, full-duplex on the peer port, and to run ethtool -t for port-B on Server-3 to cross-check the results too.

Note: this will help you identify whether it is the Server-3 port's driver/firmware that acts differently with auto-neg, since sending and receiving packets is successful with examples/skeleton run as $RTE_SDK/examples/skeleton/build/basicfwd -l 1.
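For reference, the DPDK-side counterpart of forcing no auto-neg at 10G is the link_speeds field of struct rte_eth_conf. A minimal sketch, assuming the pre-20.11 macro names this code base already uses; app_port_conf_fixed_10g is an illustrative helper, not part of the original application:

#include <string.h>

#include <rte_ethdev.h>

static struct rte_eth_conf
app_port_conf_fixed_10g(void)
{
    struct rte_eth_conf conf;

    memset(&conf, 0, sizeof(conf));
    /* ETH_LINK_SPEED_FIXED disables auto-negotiation on the DPDK port;
     * the peer (port-B on Server-3) must then be forced to the same
     * 10G full-duplex mode with ethtool, or the link will not come up. */
    conf.link_speeds = ETH_LINK_SPEED_10G | ETH_LINK_SPEED_FIXED;
    /* ... remaining rxmode/txmode fields as in app_port_conf_init() ... */
    return conf;
}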
[EDIT-1] Based on the update in the comments, it looks like rte_eth_link_get_nowait is the fast (non-waiting) approach, and the right API to use here is rte_eth_link_get. Requested an online debug session with the author.
[EDIT-2] Based on the comments, rte_eth_link_get has done the desired job. As I recollect, rte_eth_link_get waits for the actual readout from the physical device registers, while rte_eth_link_get_nowait returns without waiting; hence the right values are populated by rte_eth_link_get.
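A rough way to see that difference right after rte_eth_dev_start() is to time both calls. This sketch assumes the older void-returning API used elsewhere in the question; compare_link_calls is an illustrative name:

#include <stdio.h>
#include <stdint.h>

#include <rte_cycles.h>
#include <rte_ethdev.h>

static void
compare_link_calls(uint16_t port)
{
    struct rte_eth_link link;
    uint64_t hz = rte_get_timer_hz();
    uint64_t t0;

    t0 = rte_get_timer_cycles();
    rte_eth_link_get_nowait(port, &link);   /* returns immediately */
    printf("nowait: %.3f s, status=%u\n",
        (double)(rte_get_timer_cycles() - t0) / hz,
        (unsigned)link.link_status);

    t0 = rte_get_timer_cycles();
    rte_eth_link_get(port, &link);          /* may block while the PHY settles */
    printf("get:    %.3f s, status=%u\n",
        (double)(rte_get_timer_cycles() - t0) / hz,
        (unsigned)link.link_status);
}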