Tags: c, language-lawyer, pointer-arithmetic

Pointer to one before first element of array


In C, pointer arithmetic and comparisons are well defined as long as the pointers refer to elements of the same array or to one element past the end of that array. What about one before the first element of the array? Is that okay as long as I do not dereference it?

Given

int a[10], *p;
p = a;

(1) Is it legal to write --p?

(2) Is it legal to write p-1 in an expression?

(3) If (2) is okay, can I assert that p-1 < a?

There is a practical concern behind this. Consider a reverse() function that reverses a C string terminated by '\0' in place.

#include <stdio.h>

void reverse(char *p)
{
    char *b, t;

    b = p;
    while (*p != '\0')
        p++;
    if (p == b)      /* Do I really need */
        return;      /* these two lines? */
    for (p--; b < p; b++, p--)
        t = *b, *b = *p, *p = t;
}

int main(void)
{
    char a[] = "Hello";

    reverse(a);
    printf("%s\n", a);
    return 0;
}

Do I really need to do the check in the code?

Please share your ideas from language-lawyer/practical perspectives, and how you would cope with such situations.


Solution

  • (1) Is it legal to write --p?

    It's "legal" as in the C syntax allows it, but it invokes undefined behavior. For the purpose of finding the relevant section in the standard, --p is equivalent to p = p - 1 (except p is only evaluated once). Then:

    C17 6.5.6/8

    If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

    Here p - 1 points neither to an element of a nor to one past its last element, so the evaluation itself invokes undefined behavior. It doesn't matter whether you dereference the pointer or not: you have already invoked undefined behavior.

    Furthermore:

    C17 6.5.6/9:

    When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object;

    If your code violates a "shall" that appears outside a constraint section of the ISO standard, as this one does, it invokes undefined behavior (C17 4/2).

  • (2) Is it legal to write p-1 in an expression?

    Same as (1): evaluating p - 1 is undefined behavior, whether or not the result is stored or dereferenced.


    As for how this could cause problems in practice: imagine that the array is placed at the very beginning of a valid memory page. When you decrement the pointer to outside that page, the hardware could raise an exception, or the resulting pointer value could be a trap representation. This is not an unlikely scenario on microcontrollers, particularly those using segmented memory maps.
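
As for coping with this in practice, one common approach is to start from the one-past-the-end pointer, which is valid to form, and decrement before each use rather than after. Note that in the question's reverse(), an empty string gives p == b after the first loop, so the p-- in the for initializer would form exactly the pointer discussed above; that version does need the check. The following is only a sketch of the alternative, not the one true way to write it. The names reverse_safe and sum_backward are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Reverse a '\0'-terminated string in place.
   [b, e) is the half-open range still to be reversed, so neither
   pointer is ever moved before the first character. */
void reverse_safe(char *s)
{
    char *b = s;
    char *e = s + strlen(s);   /* one past the last character: valid to form */

    while (e - b > 1) {        /* at least two characters left to swap */
        char t = *b;
        --e;                   /* step onto the last character of the range */
        *b++ = *e;
        *e = t;
    }
}

/* Walk an array from last to first element.
   p is decremented before it is used, so it stays within [a, a + n]. */
int sum_backward(const int *a, size_t n)
{
    int sum = 0;
    const int *p = a + n;      /* one past the end */

    while (p != a) {
        --p;
        sum += *p;
    }
    return sum;
}
```

With this shape, the empty string needs no special case: e == b on entry, the loop body never runs, and no pointer outside the array is ever computed. Calling reverse_safe on a writable array holding "Hello" leaves "olleH" in it.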