c language-lawyer standards undefined-behavior

Is it UB to compare pointers cast from integers?


#define POINTER_TO_SOMETHING ((int*)(0x80000000))
#define BAD_POINTER ((int*)(0x12345678))

int *p = some_condition ? POINTER_TO_SOMETHING + 42 : BAD_POINTER;
if (p == BAD_POINTER) { ... }

Does this code have any undefined behavior, or only implementation-defined behavior?

If it helps: the code runs in kernel mode, managing page tables among other things, and 0x80000000 plus the following 4KB is already registered in the page table. I suspect this hardware-specific information is irrelevant to whether the behavior is undefined, though, since that is a C-level question.

==EDIT== I know it's UB to compare relationally if an invalid pointer is involved:

6.5.8 Relational operators

shift-expression
relational-expression < shift-expression
relational-expression > shift-expression
relational-expression <= shift-expression
relational-expression >= shift-expression

When two pointers are compared, ...
In all other cases, the behavior is undefined. ...

But I find no such requirement for equality tests.

Converting to uintptr_t might reduce ambiguity, but that's not the question.


Solution

  • In the code, there are some things you could expect to be UB:

    // (1) conversion of integer to pointer is implementation-defined
    #define POINTER_TO_SOMETHING ((int*)(0x80000000))
    #define BAD_POINTER ((int*)(0x12345678))
    
    // (2) adding 42 to a pointer is only defined when the result stays within
    // the same array (or one past its last element), otherwise UB
    int *p = some_condition ? POINTER_TO_SOMETHING + 42 : BAD_POINTER;
    
    // (3) equality comparison between pointers is never UB
    if (p == BAD_POINTER) { ... }
    

    Whether this code contains UB cannot be determined without consulting your compiler's documentation. In general, the mapping from integers to pointers is implementation-defined. The compiler might treat POINTER_TO_SOMETHING as a pointer into an almost infinitely large array, in which case POINTER_TO_SOMETHING + 42 is safe. Otherwise, the addition may be undefined behavior, because the + operator can only move a pointer within an array or to one past its last element.
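
    One way to sidestep question (2) is to do the offset arithmetic on the integer address and convert to a pointer only once. The following is a minimal sketch, assuming uintptr_t exists on the target; BASE_ADDR, BAD_ADDR, pick_address and address_to_pointer are hypothetical names that do not appear in the original code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants mirroring the addresses used in the question. */
#define BASE_ADDR ((uintptr_t)0x80000000u)
#define BAD_ADDR  ((uintptr_t)0x12345678u)

/* Compute the target address with unsigned integer arithmetic only, so no
   pointer arithmetic (6.5.6) is ever evaluated. The offset is 42 ints. */
static uintptr_t pick_address(int some_condition)
{
    return some_condition ? BASE_ADDR + 42u * sizeof(int) : BAD_ADDR;
}

/* The single integer-to-pointer conversion. Its result is still
   implementation-defined, exactly as in case (1). */
static int *address_to_pointer(uintptr_t addr)
{
    return (int *)addr;
}
```

    With this shape, the check against the sentinel can also be done on the integer, as in pick_address(cond) == BAD_ADDR, before any pointer is formed. Whether the resulting pointer may be dereferenced remains a matter for the implementation and the platform.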

    Unfortunately, even though integer to pointer conversions are implementation-defined behavior, and implementation-defined behavior must be documented by the compiler, GCC doesn't say much about it:

    A cast from integer to pointer discards most-significant bits if the pointer representation is smaller than the integer type, extends according to the signedness of the integer type if the pointer representation is larger than the integer type, otherwise the bits are unchanged.

    - https://gcc.gnu.org/onlinedocs/gcc/Arrays-and-pointers-implementation.html

    From this, we can see that GCC considers the mapping to be a bit-cast which possibly sign-extends, but nothing is said about what GCC thinks the pointer points to.

    Definition of behavior for cases (1), (2), (3)

    (1) Integer to pointer conversion

    An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might produce an indeterminate representation when stored into an object.

    - C23 6.3.2.3 p5

    (2) Addition of integers onto pointers

    [...] If the pointer operand and the result do not point to elements of the same array object or one past the last element of the array object, the behavior is undefined. [...]

    - C23 6.5.6 p9

    (3) Equality comparison between pointers

    Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

    - C23 6.5.9 p7
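
    The cases listed in that paragraph can be illustrated with ordinary objects. This is a small sketch with hypothetical helper names. It only covers pointers to real objects and null pointers, not pointers produced from integers:

```c
#include <stdbool.h>
#include <stddef.h>

/* Two null pointers compare equal. */
static bool both_null_equal(void)
{
    int *a = NULL;
    int *b = NULL;
    return a == b;
}

/* The same array element reached in two different ways compares equal. */
static bool same_element_equal(void)
{
    int arr[4] = {0};
    const int *a = &arr[2];
    const int *b = arr + 2;
    return a == b;
}

/* Two one-past-the-end pointers of the same array compare equal. */
static bool one_past_end_equal(void)
{
    int arr[4] = {0};
    const int *end1 = arr + 4;
    const int *end2 = &arr[3] + 1;
    return end1 == end2;
}

/* Pointers to two distinct objects compare unequal. */
static bool distinct_objects_equal(void)
{
    int x = 0;
    int y = 0;
    return &x == &y;
}
```

    The first three helpers return true and the last returns false. None of them evaluates a relational operator, so only the rules of 6.5.9 apply.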

    (4) Relational comparison between pointers

    The rules for <, <=, >, and >= differ from those for equality comparison: the behavior is undefined unless both pointers point into the same object or array (including one past the last element of the array).

    - C23 6.5.8 p6
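
    To make the contrast concrete, here is a hedged sketch with hypothetical helper names. The first function is only well-defined when the caller passes pointers into the same array. The second orders arbitrary addresses by converting them to uintptr_t first, assuming that type exists. The comparison is then between integers, so it is not UB, but what the ordering means depends on the implementation-defined pointer-to-integer mapping:

```c
#include <stdbool.h>
#include <stdint.h>

/* Well-defined only if a and b point into (or one past the end of)
   the same array object; otherwise 6.5.8 makes the behavior undefined. */
static bool before_in_same_array(const int *a, const int *b)
{
    return a < b;
}

/* Compares the converted integer values instead of the pointers.
   No relational pointer comparison is evaluated; the result reflects
   the implementation-defined pointer-to-integer conversion. */
static bool address_less(const void *a, const void *b)
{
    return (uintptr_t)a < (uintptr_t)b;
}
```

    For two elements of one array, before_in_same_array(&arr[0], &arr[3]) is true. On implementations with a flat address space, address_less usually agrees with it, although the standard does not promise that.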