
How is the compiler tricked into providing a pointer to the enclosing class?


I was reading an article on how C++ does not have field accessors as part of the language.

At the end of the post, the author gives a macro-based solution that emulates field accessors for classes:

// a little trick to fool compiler we are not accessing NULL pointer here
#define property_offset(type, name) \
  (((char*)&((type*)(0xffff))->name) - (char*)(0xffff))

#define property_parent(type, name) \
  ((type*)((char*)(this) - property_offset(type, name)))

// macro defining property
#define property(type, name, parent)                                         \
  struct name##_property {                                                   \
    operator type() { return property_parent(parent, name)->get_##name(); }  \
    void operator=(type v) { property_parent(parent, name)->set_##name(v); } \
                                                                             \
   private:                                                                  \
    char zero[0];                                                            \
  } name

// our main class
class Node {

  /* visitCount will act as a field accessor */
  property(int, visitCount, Node);
};

When I run this through the preprocessor, I get:

class Node {

  struct visitCount_property {
    operator int() { return ((Node*)((char*)(this) - (((char*)&((Node*)(0xffff))->visitCount) - (char*)(0xffff))))->get_visitCount(); }
    void operator=(int v) { ((Node*)((char*)(this) - (((char*)&((Node*)(0xffff))->visitCount) - (char*)(0xffff))))->set_visitCount(v); }    
    private: char zero[0];
    } visitCount;
};  

The idea is that I would also add my own implementations of:

int get_visitCount();
void set_visitCount(int v); 

And it would look as if visitCount was being directly accessed.
However, the functions would actually be called behind the scenes:

Node n;
n.visitCount = 1;     //actually calls set method
cout << n.visitCount; //actually calls get method

I'd like to know more about this trick of accessing the enclosing class:

((Node*)((char*)(this) - (((char*)&((Node*)(0xffff))->visitCount) - (char*)(0xffff))))

What is the significance of 0xffff (65535 in decimal)?

How does this trick the compiler into giving access to the class that encloses the visitCount_property struct?

I also see that this does not work on MSVC, so I was wondering whether there is a standard way of accomplishing what this hack is doing.


Solution

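  • The value 0xffff has no special significance. It's just an arbitrary address. It could be zero, and the expression would be simpler if it were; going by the comment in the macro, a non-zero value was chosen so that the code does not look like a null-pointer access. Let's break the expression into pieces, writing addr in place of 0xffff: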

    (((char*)&((type*)(addr))->name) - (char*)(addr))
    

    (type*)(addr) just gives us a type* that points at addr. The C-style cast acts as a reinterpret_cast here. Let's call that pointer obj:

     (((char*)&obj->name) - (char*)(addr))
    

    We can even replace the second addr with obj. It has the wrong type, but we're casting to char* anyway, and it makes it clearer what's going on:

     (((char*)&obj->name) - (char*)(obj))
    

    &obj->name is just the address of that particular member, so let's call it mem_ptr:

    ((char*)mem_ptr) - (char*)(obj)
    

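    Now it's clear: we're taking the address of the member (as a char*) and subtracting the address of the parent object (as a char*). The result is the member's byte offset within the parent. We used 0xffff just to have the same initial address in both places. property_parent then runs this in reverse: inside the property struct, this is the address of the member, so subtracting the offset from it (again as a char*) yields the address of the enclosing object.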

    Note that the C++ standard also defines a macro for this directly: offsetof, declared in <cstddef>. It comes with the caveat "If type is not a standard-layout class (Clause 9), the results are undefined." (C++17 and later make that case conditionally-supported instead.)