Tags: c, memory, undefined-behavior

Writing outside the bounds of a matrix


An exam exercise asked us to write down, on paper, the values held in memory after the following program runs. The issue is that the program accesses unallocated memory when the for loop reaches i = 3. Is the value written to unallocated memory actually stored, or is this undefined behavior, in which case it would be correct not to write it down because we cannot know what actually happens? The program:

#include <stdio.h>

int main()
{
  int k[5][7] = {0};
  int *p, *z;
 
  p = k[1];
  *p = 12;
 
  for (int i = 0; i < 4; i++){
    z = p + 8;
    *z = *p + 2;
    p = z;
  }

  return 0;
}

When I run the program I get a segmentation fault, but in the exam correction the professor still assigned the value written to the unallocated memory area. So is it right to write down, on paper, the value that we are writing to unallocated memory?


Solution

  • The rules for pointer arithmetic (including the use of []) are specified in the C standard chapter for the additive operators (C17 6.5.6 §8). The most important part:

    If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

    • p points at the first element of the array k[1]. For the above rules, it is not relevant that k is a multi-dimensional array.
    • Pointer arithmetic on p is only allowed within the bounds of this array k[1].
    • As a special rule, pointer arithmetic may also produce a pointer exactly 1 item past the end of the array. So p + 7 would be well-defined, as long as it is not dereferenced.
    • But p + 8 is undefined behavior.

    Basically, the multi-dimensional array is guaranteed to be allocated contiguously in memory, so there is a valid, allocated int directly past the end of the inner array k[1], namely k[2][0]. But we may not access that item, or calculate the address of anything beyond it, using pointer arithmetic based on a pointer pointing into k[1]. The pointer arithmetic itself may result in incorrect code being generated, since the compiler is free to assume that any access to k[1] through p/z will not change items in the array k[2].

    For example we could add this code to the end of the loop:

    if(k[2][1]==0)
      puts("zero");
    

    And the compiler is then in theory free to replace it with an unconditional puts("zero"), since nothing in the code is allowed to change this array item through p or z.

    The correct answer to your exam question is therefore: it is undefined behavior and anything can happen.