c, language-lawyer

Pointing to before the beginning of the array in a for loop


Consider the following C code:

for (data_object_t *first = get_first(dev),
                   *last  = first + get_num_of_data_objects(dev) - 1,
                   *curr  = first;
     curr <= last;
     curr++) {
    // process *curr
}

Assume that get_first() returns a pointer to an object sitting within some big array of data_object_t's, and that get_num_of_data_objects() never returns a value that would cause an out-of-bounds access.

My question is specifically about the case where get_num_of_data_objects() returns 0.
In this case, last will be first - 1, so I'd expect the loop to simply be skipped (because curr <= last evaluates to false right from the start). What bothers me is that last may then point to before the beginning of the array. Even though it is never dereferenced:
Is this bad practice?
Is there a potential for undefined behavior here?


Solution

  • If first points to the start of an array, first - 1 attempts to create a pointer to before the beginning of the array. In this situation, the behaviour of first - 1 is undefined.

    The rules regarding pointer addition and subtraction are spelled out in section 6.5.6p8 of the C standard:

    When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

    These rules do not allow for creating a pointer to before the beginning of an array.

    What they do allow, however, is a pointer to one element past the end of the array (although it cannot be dereferenced). You can use that to rewrite your loop as follows:

    for (data_object_t *first = get_first(dev),
                       *last  = first + get_num_of_data_objects(dev),
                       *curr  = first;
         curr < last;
         curr++) {
        // process *curr
    }
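
To make the one-past-the-end idiom concrete, here is a minimal, self-contained sketch. The data_object_t layout (a single int field) and the sum_values() helper are hypothetical stand-ins invented for illustration; they are not part of the code in the question. The point is only that a count of 0 makes last equal to first, so the loop body never runs and no pointer before the array is ever formed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the question's data_object_t. */
typedef struct {
    int value;
} data_object_t;

/*
 * Sums the `value` field of `count` objects starting at `first`.
 * Assumes `first` points into (or to the start of) an array that holds
 * at least `count` objects from that position onward.
 */
int sum_values(const data_object_t *first, size_t count)
{
    assert(first != NULL);

    /* One past the final object: fine to form and compare, never dereferenced. */
    const data_object_t *last = first + count;

    /* By contrast, `first - 1` would be undefined behavior whenever `first`
     * points to element 0 of its array, even if the result is never used. */

    int sum = 0;
    for (const data_object_t *curr = first; curr < last; curr++) {
        sum += curr->value;   /* curr is always a valid element here */
    }
    return sum;
}
```

With count equal to 0, last == first, the condition curr < last is false on the first check, and the function returns 0 without touching any element.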