Tags: c, pointers, undefined-behavior, local-variables

Why does returning local array pointers work whenever I first iterate through the array?


I played around with returning array pointers to gain greater understanding of how they work and found something I cannot explain.

Why does the following code work whenever I iterate through the returned array before actually returning it (see the conditional section)?
Is it because C is lazy and does not actually write the array to memory, since it thinks the array will never be accessed anyway?

#include <stdio.h>

//returns pointer to array terminated by (-1)
int* thisDoesntMakeAnySense(int randomNumber)  {
    int array[randomNumber + 1];

    for(int i = 0; i < randomNumber; i++) {
        array[i] = i;
    }
    array[randomNumber] = -1;

#ifdef ENABLE_CLUDGE
    int i = 0;
    while(array[i] != -1) {
        printf("%d\n", array[i]);
        i++;
    }
#endif
    return array;
}

int main() {
    int* test = thisDoesntMakeAnySense(3);

    while(*test != -1) {
        printf("%d\n", *test);
        test++;
    }
    return 0;
}

Solution

  • Yes, when your compiler sees the array is not read before it is returned, it eliminates the code that writes to the array.

    This is not laziness; it is optimization. The compiler has to do more work to figure these things out. A goal of doing more work in the compiler is to create a program that does less work (or uses less space or other resources) when it executes.

    This behavior is not guaranteed by the C standard, but is a common feature of compilers.

    Note that the compiler is able to ignore the reads of the array that you attempt in main because these reads are not defined by the C standard. In thisDoesntMakeAnySense, the array array is created as the function begins, and it is (in the model of computing used by the C standard) destroyed when the function returns. Although the function returns a pointer to the array, the array does not exist (in the model), and the pointer is not valid. Because the pointer is not valid, the compiler is not required to give any meaning to the code in main that uses the pointer. Therefore, the compiler is permitted by the C standard to reason that this program writes to the array but never reads from it, not even in main, and therefore the writes to the array have no effect and may be removed.
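
    For contrast, here is a minimal sketch of one way to produce the same sequence with defined behavior: allocate the array on the heap so that it outlives the call. The name makeSequence is hypothetical and is not part of the question's code; a caller-supplied buffer would work just as well.

```c
#include <stdlib.h>

/* Hypothetical counterpart to thisDoesntMakeAnySense: returns a
   heap-allocated array 0, 1, ..., count-1 terminated by -1, or NULL if
   count is negative or the allocation fails. The caller must free() it. */
int *makeSequence(int count) {
    if (count < 0) {
        return NULL;
    }
    int *array = malloc(((size_t)count + 1) * sizeof *array);
    if (array == NULL) {
        return NULL;
    }
    for (int i = 0; i < count; i++) {
        array[i] = i;
    }
    array[count] = -1;
    return array;  /* allocated storage persists until free() is called */
}
```

    Because the allocated object still exists after the function returns, a loop like the one in main reads a valid object, and the compiler has to keep the writes that fill it.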

    In general, a compiler is not required to generate a program that carries out every step exactly as your source code is written. The C standard only requires the compiler (or your C implementation generally, including all the standard headers and libraries and supporting software) to generate a program that has the same observable behavior as your source code does. Observable behavior includes:

    • Data written to files.
    • Input and output interactions.
    • Accesses to volatile objects.

    Thus, the output of a printf statement is observable behavior. But writing into an array that is never used is not observable behavior, so the compiler is not required to generate code for it.
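
    As a rough illustration of that rule, the two functions below (both names are made up for this sketch) have the same observable behavior, so a compiler may treat the first as if it were the second. Whether a particular compiler actually does so depends on the compiler and its optimization settings.

```c
/* Fills a local array that nothing ever reads, then returns a value that
   does not depend on it. The stores are not observable behavior. */
int storesOnly(int n) {
    int scratch[4];
    for (int i = 0; i < 4; i++) {
        scratch[i] = n + i;  /* dead store: there is no later read */
    }
    return n;
}

/* A form the compiler is permitted to emit instead: same return value,
   same (absent) output, and no array at all. */
int storesRemoved(int n) {
    return n;
}
```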

    As for the code that happens to make your program appear to work: that code reads from the array and writes to output (which is observable behavior), so the program must[1] write that output. Your compiler apparently implements this by actually creating the array, writing to it, then reading the array and writing it to standard output. Then, upon returning to main, because the array was created and its data in memory has not yet been altered, the code in main to print it happens to “work.” However, the compiler could have implemented the required observable behavior by simply printing the output using a constant string generated at compile time and not using an array at all. In this case, the code in main would fail.
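
    To make that last alternative concrete, here is a hedged sketch of the two implementations for the case randomNumber == 3. The function names and the FILE * parameter are assumptions added for illustration; the question's code prints to standard output with printf.

```c
#include <stdio.h>

/* Straightforward version: build the array in memory, then print it. */
void printViaArray(FILE *out) {
    int array[4];
    for (int i = 0; i < 3; i++) {
        array[i] = i;
    }
    array[3] = -1;
    for (int i = 0; array[i] != -1; i++) {
        fprintf(out, "%d\n", array[i]);
    }
}

/* Same observable behavior with no array: the three lines come from a
   string constant that could be computed at compile time. */
void printViaConstant(FILE *out) {
    fputs("0\n1\n2\n", out);
}
```

    Both functions write the same bytes to out, but only the first leaves 0, 1, 2, -1 behind in stack memory for a dangling pointer to stumble upon.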

    Footnote

    [1] The program must write the output if the program is otherwise okay. However, the fact that there is undefined behavior later in the program’s execution poisons the upstream execution. If program control enters a path on which there is, unconditionally, undefined behavior, the behavior of that entire path is undefined.