How to use perf tool with docker running stress-ng?

I am using the stress-ng Docker image from https://hub.docker.com/r/polinux/stress-ng/dockerfile to stress my system, and I want to use the perf tool to monitor metrics.

Running perf stat -- stress-ng --cpu 2 --timeout 10 runs stress-ng for 10 seconds and returns performance metrics. I tried to do the same with the Docker image using perf stat -- docker run -ti --rm polinux/stress-ng --cpu 2 --timeout 10. This returns metrics, but they are not the metrics of stress-ng.

The output I got when using 'perf stat' on stress-ng:

Performance counter stats for 'stress-ng --cpu 2 --timeout 10':

  19975.863889      task-clock (msec)         #    1.992 CPUs utilized          
         2,057      context-switches          #    0.103 K/sec                  
             7      cpu-migrations            #    0.000 K/sec                  
         8,783      page-faults               #    0.440 K/sec                  
52,568,560,651      cycles                    #    2.632 GHz                    
89,424,109,426      instructions              #    1.70  insn per cycle         
17,496,929,762      branches                  #  875.904 M/sec                  
    97,910,697      branch-misses             #    0.56% of all branches        

  10.025825765 seconds time elapsed

The output I got when using 'perf stat' on the Docker image:

Performance counter stats for 'docker run -ti --rm polinux/stress-ng --cpu 2 --timeout 10':

    154.613610      task-clock (msec)         #    0.014 CPUs utilized          
           858      context-switches          #    0.006 M/sec                  
           113      cpu-migrations            #    0.731 K/sec                  
         4,989      page-faults               #    0.032 M/sec                  
   252,242,504      cycles                    #    1.631 GHz                    
   375,927,959      instructions              #    1.49  insn per cycle         
    84,847,109      branches                  #  548.769 M/sec                  
     1,127,634      branch-misses             #    1.33% of all branches        

  10.704752134 seconds time elapsed

Can someone please help me with how to get metrics of stress-ng when run using docker?

Solution

Carrying on from comments by @osgx,

By default, the perf stat command monitors not only all the threads of the target process, but also its child processes and their threads.

The problem here is that running perf stat on the docker run stress-ng command monitors the docker client, not the actual stress-ng process. The processes running inside the container are not started by the docker client, but by the docker-containerd-shim process (which is a grandchild of the dockerd process). They are therefore not descendants of docker run, so perf stat does not follow them.

This becomes evident if you run stress-ng inside the container and look at the process tree.

docker run -ti --name=stress-ng --rm polinux/stress-ng --cpu 2 --timeout 10000

ps -elf | grep docker

0 S ubuntu    26379 114001  0  80   0 - 119787 futex_ 12:33 pts/3   00:00:00 docker run -ti --name=stress-ng --rm polinux/stress-ng --cpu 2 --timeout 10000
4 S root      26431 118477  0  80   0 -  2227 -      12:33 ?        00:00:00 docker-containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/72a8c2787390669ff4eeae6f343ab4f9f60434f39aae66b1a778e78b7e5e45d8 -address /var/run/docker/containerd/docker-containerd.sock -containerd-binary /usr/bin/docker-containerd -runtime-root /var/run/docker/runtime-runc
0 S ubuntu    26610  26592  0  80   0 -  3236 pipe_w 12:34 pts/6    00:00:00 grep --color=auto docker
4 S root     118453      1  3  80   0 - 283916 -     May02 ?        01:01:57 /usr/bin/dockerd -H fd://
4 S root     118477 118453  4  80   0 - 457853 -     May02 ?        01:14:36 docker-containerd --config /var/run/docker/containerd/containerd.toml

----------------------------------------------------------------------

ps -elf | grep stress-ng

0 S ubuntu    26379 114001  0  80   0 - 119787 futex_ 12:33 pts/3   00:00:00 docker run -ti --name=stress-ng --rm polinux/stress-ng --cpu 2 --timeout 10000
4 S root      26455  26431  0  80   0 - 16621 -      12:33 pts/0    00:00:00 /usr/bin/stress-ng --cpu 2 --timeout 10000
1 R root      26517  26455 99  80   0 - 16781 -      12:33 pts/0    00:01:08 /usr/bin/stress-ng --cpu 2 --timeout 10000
1 R root      26518  26455 99  80   0 - 16781 -      12:33 pts/0    00:01:08 /usr/bin/stress-ng --cpu 2 --timeout 10000
0 S ubuntu    26645  26592  0  80   0 -  3236 pipe_w 12:35 pts/6    00:00:00 grep --color=auto stress-ng

The PPID of the first stress-ng process is 26431, which is not the docker run command (PID 26379) but the docker-containerd-shim process. Monitoring the docker run command will never reflect the correct values, because the docker client only asks the daemon to start the container and never starts the stress-ng processes itself.
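The same check can be done without reading the ps listing by hand. The following is a minimal sketch, assuming a procps-style ps; the function name ancestors is made up for illustration. It prints a process and each of its parents up to PID 1, so you can see whether docker run appears anywhere in the chain.

```shell
# Print "PID COMMAND" for a process and each of its ancestors, up to PID 1.
# "ancestors" is a hypothetical helper name; it assumes a procps-style ps.
ancestors() {
    pid=$1
    # Stop when the parent is 0 (above PID 1) or the process no longer exists.
    while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
        ps -o pid= -o comm= -p "$pid"
        pid=$(ps -o ppid= -p "$pid" | tr -d ' ')
    done
}
```

With the PIDs from the listing above, ancestors 26517 would be expected to walk through 26455 and the shim (26431) without ever reaching the docker run process (26379).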

One way to get around this problem would be to attach the perf stat command to the PIDs of the stress-ng processes that are started by the docker runtime.

For example, in the case above, as soon as the docker run command has started, you can run:

perf stat -p 26455,26517,26518

 Performance counter stats for process id '26455,26517,26518':

     148171.516145      task-clock (msec)         #    1.939 CPUs utilized          
                49      context-switches          #    0.000 K/sec                  
                 0      cpu-migrations            #    0.000 K/sec                  
                67      page-faults               #    0.000 K/sec

You may want to increase --timeout a little so that the command runs longer, since perf stat is now started after stress-ng. You also have to account for the short initial interval that is not measured.
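To avoid copying PIDs by hand, the lookup can be scripted. This is only a sketch: pids_of and perf_attach are made-up helper names, and it assumes that pgrep is available on the host and that no other stress-ng instances are running there, since pgrep matches by process name across the whole host.

```shell
# Comma-separated PIDs of every process named exactly "$1" (empty if none).
pids_of() {
    pgrep -d, -x "$1"
}

# Attach perf stat to those PIDs and count for "$2" seconds (default 10).
# The trailing "sleep" only bounds the measurement window.
perf_attach() {
    pids=$(pids_of "$1") || { echo "no process named $1" >&2; return 1; }
    perf stat -p "$pids" -- sleep "${2:-10}"
}
```

Once the container is running, perf_attach stress-ng 10 would collect counters for the matching processes for about 10 seconds. Reading the counters may still require root or a permissive perf_event_paranoid setting, because the container processes run as root.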

The other way is to run perf stat inside the Docker container, something like docker run ... perf stat ..., but for that you would have to grant extra privileges to your container, since Docker's default seccomp profile blocks the perf_event_open system call. The linked answer covers this in more detail.
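A rough sketch of that second approach follows. It rests on several assumptions: my-perf-stress-ng is a hypothetical image that has both perf and stress-ng installed, perf_in_container and DRY_RUN are made-up names, and --privileged is used only as the bluntest way to lift the restriction (narrower options such as a custom seccomp profile may be preferable).

```shell
# Run "perf stat -- stress-ng ARGS..." inside a privileged container.
# Usage: perf_in_container IMAGE [stress-ng args...]
# IMAGE must provide both perf and stress-ng (an assumption, see above).
# If DRY_RUN is set, the docker command is printed instead of executed.
perf_in_container() {
    image=$1
    shift
    ${DRY_RUN:+echo} docker run --rm --privileged "$image" \
        perf stat -- stress-ng "$@"
}
```

For example, DRY_RUN=1 perf_in_container my-perf-stress-ng --cpu 2 --timeout 10 prints the docker run command it would execute, which is a convenient way to review it before granting the container elevated privileges.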