Tags: performance, linux-kernel, network-programming, benchmarking, perf

Network performance issues and slow tcp_write_xmit/tcp_ack syscalls with a lot of save_stack calls on OpenVZ kernel


I ran into trouble with bad network performance on CentOS. The issue was observed on the latest OpenVZ RHEL7 kernel (3.10 based) on a Dell server with 24 cores and a Broadcom 5720 NIC, regardless of whether it was the host system or an OpenVZ container. The server receives RTMP connections and reproxies RTMP streams to other consumers. Reads and writes were unstable, and streams froze periodically for a few seconds.

I started checking the system with strace and perf. Strace affects the system heavily, so it seems only perf can help. I used the OpenVZ debug kernel with debugfs enabled. The system spends too much time in the swapper process (according to perf data). I built a flame graph for the system under load (100 Mbit/s of data in, 200 Mbit/s out) and noticed that the kernel spent too much time in tcp_write_xmit and tcp_ack. On top of these calls I see save_stack calls.
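For reference, this is a typical way to capture such a flame graph (a sketch using Brendan Gregg's FlameGraph scripts; the sampling rate, duration, and script paths are assumptions, not taken verbatim from my setup):

```shell
# Sample all CPUs system-wide at 99 Hz with call stacks for 60 seconds
perf record -F 99 -a -g -- sleep 60

# Fold the recorded stacks and render an interactive SVG flame graph
# (stackcollapse-perf.pl and flamegraph.pl come from the FlameGraph repo)
perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > flamegraph.svg
```

With a debug kernel, frame-pointer unwinding usually works out of the box; on kernels without frame pointers, `perf record --call-graph dwarf` may give more complete stacks at the cost of larger perf.data files.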

On the other hand, I tested the same scenario on an Amazon EC2 instance (latest Amazon Linux AMI 2017.09) and perf doesn't show such issues there. The total number of samples was 300000; the system spends 82% of its time in swapper according to perf, but net_rx_action (and consequently tcp_write_xmit and tcp_ack) in swapper accounts for only 1797 samples (0.59% of the total). On top of the net_rx_action call in the flame graph I don't see any calls related to stack traces.

[flame graph: ec2_net_rx_action]

The output of the OpenVZ system looks different. Among 1833152 samples, 500892 (27%) were in the swapper process and 194289 (10.5%) were in net_rx_action.

[flame graph: vzkernel7_all_processes_net_rx_action]

The full SVG of calls on vzkernel7 is here, and the SVG of the EC2 instance calls is here. You can download them and open them in a browser to interactively explore the flame graphs.

So I want to ask for help, and I have a few questions.

  1. Why doesn't the flame graph from the EC2 instance contain as many save_stack calls as the one from my server?
  2. Does perf force the system to call save_stack, or is it a kernel setting? Can it be disabled, and how?
  3. Does Xen on the EC2 host process all tcp_ack and other calls on behalf of the guest? Is it possible that the host system on the EC2 server does some of this work and the guest system doesn't see it?

Thank you for your help.


Solution

  • I've read the kernel sources and have an answer to my questions.

    The save_stack calls are caused by the Kernel Address Sanitizer (KASAN) feature, which was enabled in the OpenVZ debug kernel by the CONFIG_KASAN option. When this option is enabled, on each kmem_cache_free call the kernel calls __cache_free:

    static inline void __cache_free(struct kmem_cache *cachep, void *objp,
                unsigned long caller)
    {
        /* Put the object into the quarantine, don't touch it for now. */
        if (kasan_slab_free(cachep, objp))
            return;
    
        ___cache_free(cachep, objp, caller);
    }
    

    With CONFIG_KASAN disabled, kasan_slab_free returns false (see include/linux/kasan.h), so the object is freed immediately with no quarantine and no stack-trace bookkeeping. The OpenVZ debug kernel was built with CONFIG_KASAN=y; the Amazon AMI kernel was not.
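To check which behavior a given machine has, you can look for CONFIG_KASAN in the kernel build config (a sketch; the /boot config path assumes a RHEL/CentOS-style layout, and not every kernel exposes /proc/config.gz):

```shell
# Look for CONFIG_KASAN in the running kernel's build config.
# "CONFIG_KASAN=y" means KASAN (and its save_stack bookkeeping) is compiled in;
# "# CONFIG_KASAN is not set" or no match at all means it is disabled.
grep CONFIG_KASAN "/boot/config-$(uname -r)" || echo "CONFIG_KASAN not found"

# Some kernels expose the build config via procfs instead:
zgrep CONFIG_KASAN /proc/config.gz 2>/dev/null
```

Note that KASAN is a compile-time option: it cannot be switched off at runtime, so avoiding its overhead means running a kernel built without it (i.e. a non-debug kernel in this case).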