A local variable (say an int) can be stored in a processor register, at least as long as its address is not needed anywhere. Consider a function computing something, say, a complicated hash:
int foo(int const* buffer, int size)
{
    int a = 0; // local variable
    // perform heavy computations involving frequent reads and writes to a
    return a;
}
Now assume that the buffer does not fit into memory. We write a class for computing the hash from chunks of data, calling foo multiple times:
struct A
{
    void foo(int const* buffer, int size)
    {
        // perform heavy computations involving frequent reads and writes to a
    }
    int a = 0;
};
A object;
while (...more data...)
{
    object.foo(buffer, size);
}
// do something with object.a
The example may be a bit contrived. The important difference here is that a was a local variable in the free function and now is a member variable of the object, so the state is preserved across multiple calls.
Now the question: would it be legal for the compiler to load a at the beginning of the foo method into a register and store it back at the end? In effect this would mean that a second thread monitoring the object could never observe an intermediate value of a (synchronization and undefined behavior aside). Provided that speed is a major design goal of C++, this seems to be reasonable behavior. Is there anything in the standard that would keep a compiler from doing this? If not, do compilers actually do this? In other words, can we expect a (possibly small) performance penalty for using a member variable, aside from loading and storing it once at the beginning and the end of the function?
As far as I know, the C++ language itself does not even specify what a register is. However, I think that the question is clear anyway. Wherever this matters, I would appreciate answers for a standard x86 or x64 architecture.
The compiler can do that if (and only if) it can prove that nothing else will access a during foo's execution.
That's a non-trivial problem in general; I don't think any compiler attempts to solve it.
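Written out by hand, the transformation being asked about looks roughly like the sketch below. The struct name Hasher, the method names update and update_cached, and the multiply-and-add mixing step are all made up for illustration, since the question does not show the actual computation. The member is unsigned here only to keep the arithmetic well defined on overflow.

```cpp
#include <cstddef>

// Hypothetical stand-in for the struct A from the question.
struct Hasher
{
    unsigned a = 0;

    // Straightforward version: every step reads and writes this->a.
    void update(int const* buffer, std::size_t size)
    {
        for (std::size_t i = 0; i < size; ++i)
            a = a * 31u + static_cast<unsigned>(buffer[i]);
    }

    // Hand-cached version: load a once, work on a local copy,
    // and store the result back once at the end.
    void update_cached(int const* buffer, std::size_t size)
    {
        unsigned local = a;
        for (std::size_t i = 0; i < size; ++i)
            local = local * 31u + static_cast<unsigned>(buffer[i]);
        a = local;
    }
};
```

The two versions give the same result only as long as nothing else reads or writes a while the method runs, which is exactly the condition the compiler would have to prove before doing this on its own.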
Consider the (even more contrived) example
struct B
{
    B(int& y) : x(y) {}
    void bar() { x = 23; }
    int& x;
};
struct A
{
    int a;
    void foo(B& b)
    {
        a = 12;
        b.bar();
    }
};
Looks innocent enough, but then we say
A baz;
B b(baz.a);
baz.foo(b);
"Optimising" this would leave 12 in baz.a, not 23, and that is clearly wrong.
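For anyone who wants to try this, here is the same example packaged so it can be compiled and checked. The helper run_alias_demo is not part of the example above; it is a hypothetical wrapper added only so the final value of baz.a can be inspected.

```cpp
struct B
{
    explicit B(int& y) : x(y) {}
    void bar() { x = 23; }   // writes through the stored reference
    int& x;
};

struct A
{
    int a = 0;
    void foo(B& b)
    {
        a = 12;
        b.bar();             // b.x refers to this->a, so a is overwritten here
    }
};

// Hypothetical helper: runs the scenario and returns the final baz.a.
int run_alias_demo()
{
    A baz;
    B b(baz.a);              // b.x now aliases baz.a
    baz.foo(b);
    return baz.a;
}
```

A conforming compiler has to make run_alias_demo return 23. If foo kept a in a register across the call to b.bar() and wrote it back at the end, the result would be 12.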