Tags: scalability, virtualization, time-measurement

Measuring time in a virtualized environment


I developed a series of microbenchmarks using shared-memory libraries (e.g. OpenMP, TBB) to check how they scale as the number of threads varies. A minimal sketch of this kind of benchmark is shown below.
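
For illustration, this is roughly what such a scaling microbenchmark looks like (OpenMP; the kernel and problem size are placeholders, not the actual benchmarks):

```cpp
// Minimal sketch of a scaling microbenchmark (OpenMP).
// The kernel and problem size are placeholders for the real workloads.
#include <omp.h>
#include <cstdio>
#include <vector>

// Hypothetical compute-bound kernel: parallel sum of squares.
static double kernel(const std::vector<double>& v) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)v.size(); ++i)
        sum += v[i] * v[i];
    return sum;
}

int main() {
    std::vector<double> v(1 << 26, 1.0);   // placeholder problem size
    const int max_threads = omp_get_max_threads();
    double t1 = 0.0;                       // single-thread baseline time

    for (int threads = 1; threads <= max_threads; ++threads) {
        omp_set_num_threads(threads);
        double start = omp_get_wtime();
        volatile double result = kernel(v); // volatile: keep the work alive
        double elapsed = omp_get_wtime() - start;
        (void)result;
        if (threads == 1) t1 = elapsed;
        std::printf("threads=%d time=%.4fs speedup=%.2f\n",
                    threads, elapsed, t1 / elapsed);
    }
    return 0;
}
```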

Currently I'm running them on a 4-core processor; the results are reasonable, but that only gives me three points on a speedup plot.

To get more data and a broader analysis, I'm planning to run them on a 32-core machine.

One option is to buy a 32-core processor, such as an AMD Epyc or Intel Xeon. They are fairly expensive, but I know exactly what I'll get. The second, less expensive alternative is to run the benchmarks in the cloud, for example on Amazon AWS or Microsoft Azure.

Before making my choice, I need some clarification:

As far as I understand, AWS can provide a machine with as many cores as I want, but all of them are virtualized.

When I run an application there, how reliable are the time measurements of its execution?

Will I see the same scalability as when running the application on a real, physical 32-core processor?


Solution

  • From decades of experience with virtualization performance, this is an area where caution is warranted. A lot will depend on the level of contention between your virtual machine and others, which, in many cloud environments, is difficult to know without tooling. It also isn't clear whether you are discussing elapsed time, processor time, or both. Both can be influenced by virtualization, though in my experience elapsed time is more variable. I can't speak to the environments you listed, but in IBM Z virtualization solutions we provide metrics that separate the processor time consumed by the virtual machine from that consumed by the hypervisor. For your purposes, you'd want only the time consumed by the virtual machine; I don't know whether either of the platforms you mentioned exposes that information. In these types of experiments, we often find it useful to run more measurement iterations to see the run-time variability.
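
As a rough illustration of separating elapsed time from processor time and repeating iterations to expose variability, a sketch might look like the following (assuming a Linux/POSIX guest; the workload is a placeholder for the real benchmark):

```cpp
// Sketch: report both elapsed (wall-clock) and process CPU time over
// repeated iterations, so run-to-run variability under virtualization
// becomes visible. Assumes a Linux/POSIX environment.
#include <time.h>
#include <cstdio>
#include <cmath>

// Hypothetical stand-in for the benchmarked workload.
static double workload() {
    double x = 0.0;
    for (long i = 1; i < 50000000L; ++i)
        x += std::sqrt((double)i);
    return x;
}

static double seconds(const timespec& a, const timespec& b) {
    return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) * 1e-9;
}

int main() {
    const int iterations = 10;  // repeat to expose variability
    for (int i = 0; i < iterations; ++i) {
        timespec w0, w1, c0, c1;
        clock_gettime(CLOCK_MONOTONIC, &w0);           // elapsed time
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c0);  // CPU time
        volatile double r = workload();
        (void)r;
        clock_gettime(CLOCK_MONOTONIC, &w1);
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c1);
        std::printf("iter %d: elapsed=%.3fs cpu=%.3fs\n",
                    i, seconds(w0, w1), seconds(c0, c1));
    }
    return 0;
}
```

A large spread between iterations, or a large gap between elapsed and CPU time, is a hint that contention from neighbouring virtual machines is affecting the measurements.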