Search code examples
cpointersc89terminationtype-punning

Read low pointer bit in way that could *probably* work on as many systems as possible


It seems that the low bit of pointers being 0 is more-or-less pretty portable (where portable obviously does not mean "standard", but that people get away with it and can use it to some advantage in some cases, hopefully disable-able with a compile switch).

Projects that want to get fiddly have used it, with less luck on the second lowest bit:

How portable is using the low bit of a pointer as a flag?

But let's say one doesn't want to just poke a bit of data or not into a pointer of a known type. What you wish you could do instead is to use that low bit being 0 to allow a pointer type to do "double-duty" as a terminator.

So your items look like this:

struct Item {
    uintptr_t flags; // low bit zero means "not an item"
    type1 field1;
    type2 field2;
    ...
};

Then you'd like to have a situation where some container of items looks like this:

[(flags field1 field2...) (flags field1 field2...) some-pointer stuff stuff...]

You'd be thus getting away with a "sunk-cost" (let's say some internal management pointer in the data structure for another purpose) doing your termination for you.


UPDATE: To be clearer on the situation: this is where one controls the codebase and structures. So any pointer in a structure used like this you could declare as a union type, for instance:

union Maybe_Terminator_Pointer {
    uintptr_t flags;
    type1* pointer1;
    type2* pointer2;
    ...
};

...and then use that, if it helps. Excluding char*s is fine, as they of course would not count.


So an extra type punning problem here is: the pointer being used to do the test-for-termination is an Item*, and the routine doing the checking doesn't know which sort of pointer some-pointer is specifically.

I'm wondering what--if any--is the best gamble is for being able to port and compile such a trick. That includes turning the pointers into unions, #ifdef'ing the endianness of the machine and getting a char* from the byte with the bit, etc. Whatever might be more likely to work, if anyone has experience or guesses.

Imagine it's worth the effort for your case, shaving off a large amount of data. And you have the backup scenario of if people compiling find the trick isn't working somewhere...an #ifdef could use full-sized items for terminators and waste the extra space. So wondering if there are any tips on to make this obviously-standards-violating trick have a better chance of working on more systems.


Solution

  • (Self-answering to provide more information and allow people to spot any potential problems with my alternative.)

    So wondering if there are any tips on to make this obviously-standards-violating trick have a better chance of working on more systems.


    Tip One (as per comments) is don't do this if you can possibly find another way.

    For example, "you" mention this layout:

    [(flags field1 field2...) (flags field1 field2...) some-pointer stuff stuff...]
    

    But is there anything in "stuff stuff" that isn't a pointer--perhaps boring old integers that are known to be even--where you could do the same trick? If so, why not reorder this like:

    [(flags field1 field2...) (flags field1 field2...) even-uintptr_t stuff...]
    

    That way when you read flags from either in an Item or not, it will be the same type. If you look around at things that aren't opaque like pointers, you might find obvious non-opaque numbers in the current code that are always even...for instance, a lot of byte counts measuring aggregates are things you likely are guaranteed to have as % 2 = 0.


    Tip Two applies to the above alternative--and perhaps helps the odds with the standards-breaking-pointer-version too. Be sure that when you write the value you go through an "aliasing" pointer, and do not write the field directly via . or ->.

    There is no requirement for the compiler to ensure memory coherence between two different structures on a field, just because they are the same type. Say struct A begins with an uintptr_t field_a and struct B begins with an uintptr_t field_b, and you put a pointer to both at the same address. If you do some_a->field_a = value;, then reading back from a some_b->field_b pointer at that address might well not see that update, because the compiler doesn't expect you to be writing B fields via an A pointer.

    Hence go through a pointer to do the write. Something like uintptr_t *alias = &some_a->field_a; and then *alias = value will enforce coherence with the successive reads of any integer (!). (Dissatisfaction with the performance consequences of this property of pointers is why restrict exists. If this trick is to work, it can only do so by exploiting the non-restrict behavior of pointers.)

    (!) - I think you only have to do the write through a pointer, and not the reads, but perhaps someone can provide insight.