If there are similar questions, please direct me there; I searched for quite some time but didn't find anything.
Background:
I was just playing around and found some behaviour I can't completely explain... For primitive types, it looks like an assignment that involves an implicit conversion takes longer than one with an explicit cast.
#include <limits>

int iTest = 0;
long lMax = std::numeric_limits<long>::max();
for (int i = 0; i < 100000; ++i)
{
    // I had 3 such loops, each running one of the lines below.
    iTest = lMax;                    // implicit conversion
    iTest = (int)lMax;               // C-style cast
    iTest = static_cast<int>(lMax);  // static_cast
}
The result is that the C-style cast and the C++ static_cast perform the same on average (the numbers differ between runs, but with no visible difference), and they both outperform the implicit conversion.
Result:
iTest=-1, lMax=9223372036854775807
(iTest = lMax) used 276 microseconds
iTest=-1, lMax=9223372036854775807
(iTest = (int)lMax) used 191 microseconds
iTest=-1, lMax=9223372036854775807
(iTest = static_cast<int>(lMax)) used 187 microseconds
Question:
Why does the implicit conversion result in higher latency? My guess is that the assignment has to detect that the value overflows an int and adjust it to -1. But what exactly is going on in the assignment?
Thanks!
If you want to know why something is happening under the covers, the best place to look is ... wait for it ... under the covers :-)
That means examining the assembler language that is produced by your compiler.
A C++ environment is best thought of as an abstract machine for running C++ code. The standard (mostly) dictates behaviour rather than implementation details. Once you leave the bounds of the standard and start thinking about what happens underneath, the C++ source code is of little help anymore - you need to examine the actual code that the computer is running, the stuff output by the compiler (usually machine code).
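For example, you could put the three conversions into small functions, compile with optimisation, and read the assembly listing. This is only a sketch (the file name and function names are made up for illustration), but with GCC or Clang something like the following works:

// Hypothetical file convert.cpp; compile with e.g.
//   g++ -O2 -S convert.cpp -o convert.s
// then read convert.s to see which instructions were actually emitted.
#include <limits>

int convert_implicit(long v)    { return v; }                    // implicit conversion
int convert_c_cast(long v)      { return (int)v; }               // C-style cast
int convert_static_cast(long v) { return static_cast<int>(v); }  // static_cast

On most mainstream compilers I'd expect all three to come out identical (typically the value's low 32 bits are simply kept), but it's the listing, not the C++ source, that settles it.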
It may be that the compiler is throwing away the loop because it's calculating the same thing every time, so it only needs to do it once. It may be that it throws away the code altogether if it can determine you don't use the result.
There was a time, many moons ago, when the VAX Fortran compiler (I did say many moons) outperformed its competitors by several orders of magnitude in a given benchmark.
That was for that exact reason. It had determined the results of the loop weren't used so had optimised the entire loop out of existence.
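If you want to make sure the compiler can't pull that trick on your test, make the result observable. A minimal sketch along those lines, reusing your variable names (the volatile forces every store to actually happen):

#include <cstdio>
#include <limits>

int main()
{
    const long lMax = std::numeric_limits<long>::max();
    volatile int iTest = 0;              // volatile: the compiler must perform every store
    for (int i = 0; i < 100000; ++i)
    {
        iTest = static_cast<int>(lMax);  // the conversion being measured
    }
    std::printf("iTest=%d\n", iTest);    // using the result afterwards also helps
    return 0;
}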
The other thing you might want to watch out for is the measuring tools themselves. When you're talking about durations of 1/10,000th of a second, your results can be swamped by the slightest bit of noise.
There are ways to alleviate these effects such as ensuring the thing you're measuring is substantial (over ten seconds for example), or using statistical methods to smooth out any noise.
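A rough sketch of what that could look like, assuming std::chrono is available (the iteration and trial counts below are arbitrary):

#include <chrono>
#include <cstdio>
#include <limits>

int main()
{
    const long lMax = std::numeric_limits<long>::max();
    volatile int iTest = 0;                       // keep the stores from being optimised away
    using clock = std::chrono::steady_clock;

    double best = 1e300;                          // keep the fastest of several trials
    for (int trial = 0; trial < 10; ++trial)
    {
        const auto start = clock::now();
        for (long i = 0; i < 100000000; ++i)      // far more iterations, so noise matters less
        {
            iTest = static_cast<int>(lMax);       // the conversion being measured
        }
        const std::chrono::duration<double> elapsed = clock::now() - start;
        if (elapsed.count() < best)
            best = elapsed.count();
    }
    std::printf("best of 10 trials: %f s (iTest=%d)\n", best, iTest);
    return 0;
}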
But the bottom line is, it may be the measuring methodology causing the results you're seeing.