Tags: c++, new-operator, dereference

What's the standard way to avoid constant dereferencing after using the `new` keyword?


The new keyword hands you back a pointer to the object created, which means you keep having to dereference it. I'm just afraid performance may suffer.

E.g. a common situation I'm facing:

class cls {
    obj *x; ...
};

// Later, in some member function:
x = new obj(...);
for (i ...) x->bar[i] = x->foo(i + x->baz);  // much dereferencing

I'm not overly keen on reference variables either as I have many *x's (e.g. *x, *y, *z, ...) and having to write &x_ref = *x, &y_ref = *y, ... at the start of every function quickly becomes tiresome and verbose.

Indeed, is it better to do:

class cls {
    obj x; ...    // not pointer
};
x_ptr = new obj(...);
x = *x_ptr;       // then work with x, not pointer;

So what's the standard way to work with variables created by new?


Solution

  • There's no other way to work with objects created by new. The location of the unnamed object created by new is always a run-time value. This immediately means that each and every access to such an object will always unconditionally require dereferencing. There's no way around it. That is what "dereferencing" actually is, by definition - accessing through a run-time address.

    Your attempts to "replace" pointers with references by doing &x_ref = *x at the beginning of the function are meaningless. They achieve absolutely nothing. References in this context are just syntactic sugar. They might reduce the number of * operators in your source code (and might increase the number of & operators), but they will not affect the number of physical dereferences in the machine code. They will lead to absolutely the same machine code containing absolutely the same amount of physical dereferencing and absolutely the same performance.
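To make the point about references concrete, here is a minimal sketch of the loop from the question written both ways. The `obj` type below is a hypothetical stand-in (its members `baz`, `bar` and `foo` are assumptions, since the question elides them), and the function names are made up for illustration. Both functions perform the same memory accesses; one would expect an optimizing compiler to emit equivalent code for them, which can be checked by comparing the generated assembly.

```cpp
#include <cassert>

// Hypothetical stand-in for the question's `obj`; the members are assumptions.
struct obj {
    int baz = 0;
    int bar[4] = {0, 0, 0, 0};
    int foo(int v) const { return 2 * v; }
};

// Pointer syntax: every access is spelled x->member.
void fill_via_pointer(obj *x) {
    assert(x != nullptr);
    for (int i = 0; i < 4; ++i) x->bar[i] = x->foo(i + x->baz);
}

// Reference syntax: the same accesses, spelled r.member.
// The reference is only another name for *x; no object is copied.
void fill_via_reference(obj *x) {
    assert(x != nullptr);
    obj &r = *x;
    for (int i = 0; i < 4; ++i) r.bar[i] = r.foo(i + r.baz);
}
```

Calling either function on an object created with `new obj` fills `bar` with the same values; only the spelling in the source differs.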

    Note that where the same pointer is dereferenced repeatedly, an optimizing compiler will usually read the target address once and keep it in a CPU register, instead of re-reading it from memory each time. Accessing data through an address held in a CPU register is about as cheap as a memory access gets; on typical hardware it is no slower than accessing data through a compile-time address embedded into the CPU instruction. For this reason, repetitive dereferencing of manageable complexity might not have any negative impact on performance. This, of course, depends significantly on the quality of the compiler.

    In situations where you observe a significant negative impact on performance from repetitive dereferencing, you might try to cache the target value in a local buffer, use the local buffer for all calculations and then, when the result is ready, store it through the original pointer. For example, if you have a function that repeatedly accesses (reads and/or writes) data through a pointer int *px, you might want to cache the data in an ordinary local variable x:

    int x = *px;

    work with x throughout the entire function, and at the end do:

    *px = x;

    Needless to say, this only makes sense when the performance impact from copying the object is low. And of course, you have to be careful with such techniques in aliased situations, since in this case the value of *px is not maintained continuously. (Note again, that in this case we use an ordinary variable x, not a reference. Your attempts to replace single-level pointers with references achieve nothing at all.)
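As a rough sketch of both the technique and the aliasing caveat, the two functions below accumulate a sum into `*px`. The names `sum_direct` and `sum_cached` and the `data`/`n` parameters are made up for this illustration. When `px` does not point into `data`, both give the same result; when it does, the cached version works with a stale copy and the results diverge.

```cpp
#include <cassert>

// Reads and writes through the pointer on every iteration.
void sum_direct(int *px, const int *data, int n) {
    assert(px != nullptr && data != nullptr);
    for (int i = 0; i < n; ++i) *px += data[i];
}

// Copies *px into an ordinary local, works on the local,
// and stores the result through the pointer once at the end.
void sum_cached(int *px, const int *data, int n) {
    assert(px != nullptr && data != nullptr);
    int x = *px;
    for (int i = 0; i < n; ++i) x += data[i];
    *px = x;
}
```

With `int a[3] = {1, 2, 3}` and a separate target holding 10, both functions leave 16 in the target. If the target is `a[2]` itself, `sum_direct` sees its own updates while iterating and ends with 12, whereas `sum_cached` ends with 9, so the local copy is only a safe substitute when nothing else reads or writes `*px` in the meantime.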

    Again, this sort of "data caching" optimization can also be implicitly performed by the compiler, assuming the compiler has a good understanding of the data aliasing relationships present in the code. And this is where a C99-style restrict keyword can help it a lot (standard C++ has no restrict, but major compilers offer an equivalent extension such as __restrict). But that's a different topic.
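For illustration only, here is a small sketch of what such an annotation can look like. It assumes a compiler that accepts the non-standard `__restrict` spelling (GCC, Clang and MSVC do); the function name `add_to_all` and its parameters are hypothetical. The qualifier is a promise from the programmer that `dst` and `delta` do not overlap, which lets the compiler keep `*delta` in a register across the loop instead of re-reading it after every store.

```cpp
#include <cassert>

// __restrict is a compiler extension, not standard C++.
// It promises that dst and delta refer to non-overlapping memory.
void add_to_all(int *__restrict dst, const int *__restrict delta, int n) {
    assert(dst != nullptr && delta != nullptr);
    for (int i = 0; i < n; ++i) {
        dst[i] += *delta;  // *delta may be loaded once and reused
    }
}
```

The promise is not checked: calling this with `delta` pointing into `dst` breaks it, and the result is then not something to rely on. Whether the annotation changes the generated code at all depends on the compiler and optimization level.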

    In any case, there's no "standard" way to do that. The best approach depends critically on your knowledge of data flow relationships that exist in each specific piece of your code.