
C: Writing 4 bytes into a region of size 3 overflows the destination?


My simple C program is as follows. Initially, I've defined the variable buf1 as an array of 3 chars.

I don't have any problem with 2-character strings such as AB or XY:

user@linux:~/c# cat buff.c; gcc buff.c -o buff; echo -e '\n'; ./buff
#include <stdio.h>
#include <string.h>

int main() {
        char buf1[3] = "AB";
        printf("buf1 val:  %s\n", buf1);
        printf("buf1 addr: %p\n", &buf1);
        strcpy(buf1,"XY");
        printf("buf1 val:  %s\n", buf1);
}

buf1 val:  AB
buf1 addr: 0xbfe0168d
buf1 val:  XY
user@linux:~/c# 

Unfortunately, when I copy 3 characters such as XYZ, I get the following warning when compiling the program.

buff.c:8:2: warning: ‘__builtin_memcpy’ writing 4 bytes into a region of size 3 overflows the destination [-Wstringop-overflow=]
  strcpy(buf1,"XYZ");

Isn't XYZ 3 bytes? Why does the message say 4 bytes instead of 3?

user@linux:~/c# cat buff.c; gcc buff.c -o buff; echo -e '\n'; ./buff
#include <stdio.h>
#include <string.h>

int main() {
        char buf1[3] = "AB";
        printf("buf1 val:  %s\n", buf1);
        printf("buf1 addr: %p\n", &buf1);
        strcpy(buf1,"XYZ");
        printf("buf1 val:  %s\n", buf1);
}
buff.c: In function ‘main’:
buff.c:8:2: warning: ‘__builtin_memcpy’ writing 4 bytes into a region of size 3 overflows the destination [-Wstringop-overflow=]
  strcpy(buf1,"XYZ");
  ^~~~~~~~~~~~~~~~~~


buf1 val:  AB
buf1 addr: 0xbfdb34fd
buf1 val:  XYZ
Segmentation fault
user@linux:~/c# 

Solution

  • You're forgetting that C strings are null-terminated. The sizeof "AB" is 3 and sizeof "XYZ" is 4, due to the implicit terminating byte. (The type of the string literal "AB" is char[3] and the type of "XYZ" is char[4].)

    Had you not specified any length for buf1, it would also have been 3 bytes long:

    char buf1[] = "AB";  // here exactly the same as char buf1[3] = "AB";
    

    The memory layout would be

    buf1
      v
      +-------+-------+-------+
      |  [0]  |  [1]  |  [2]  |
      +-------+-------+-------+
      |  'A'  |  'B'  |  '\0' |
      +-------+-------+-------+
    

    Now, strcpy copies the terminating null character (C11 7.24.2.3p2):

    2. The strcpy function copies the string pointed to by s2 (including the terminating null character) into the array pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.

    This means that 4 bytes are copied in total, but there is space for only 3. The code therefore has undefined behaviour, and the compiler produces the diagnostic message. C11 7.1.4 Use of library functions p.2:

    [...] If a function argument is described as being an array, the pointer actually passed to the function shall have a value such that all address computations and accesses to objects (that would be valid if the pointer did point to the first element of such an array) are in fact valid.[...]

    In the actual code, the implicit access to buf1[3] is not valid, because the last valid element is buf1[2].

    Memory layout after strcpy:

    buf1
      v
      +-------+-------+-------+-------+
      |  [0]  |  [1]  |  [2]  |  ???  |
      +-------+-------+-------+-------+
      |  'X'  |  'Y'  |  'Z'  |  '\0' |
      +-------+-------+-------+-------+
    

    The warning mentions __builtin_memcpy because the compiler optimized this call: it replaced the strcpy of a string of known length with a memcpy of known length, since memcpy generates more efficient code.


    Finally, you can fit 3 characters into char buf1[3] by using strncpy, but the buffer then has no room for the terminating null character, so it cannot be printed with a plain printf("%s"). You can still print it by specifying an explicit precision that is less than or equal to the length of the array, since printf then reads at most that many bytes. In the example below a field width of 3 is given as well, so shorter values are padded:

    #include <stdio.h>
    #include <string.h>
    
    int main(void) {
        char buf1[3] = "AB";
        /* %-3.3s: left-justify in a field of 3, read at most 3 bytes */
        printf("buf1 val:  >%-3.3s<\n", buf1);
        printf("buf1 addr: %p\n", (void *)buf1);
        /* copies exactly 3 bytes; buf1 is NOT null-terminated afterwards */
        strncpy(buf1, "XYZ", 3);
        printf("buf1 val:  >%-3.3s<\n", buf1);
    }
    

    Compiling and running it:

    % gcc strncpy.c -Wall -Wextra
    % ./a.out
    buf1 val:  >AB <
    buf1 addr: 0x7ffd7f6aecc5
    buf1 val:  >XYZ<
    

    Note the extra space printed after AB: the field width of 3 pads the shorter string.