malloc() and memset() behavior


I wrote some code to see how malloc() and memset() behave, and I found a case where I don't know what's going on.

I used malloc() to allocate 15 bytes of memory for a character array, and I wanted to see what would happen if I deliberately misused memset() to set 100 bytes through the pointer I created. I expected to see that memset() had set 15 bytes (and possibly trashed some other memory). What I see when I run the program is that 26 bytes end up set to the character I specified.

Any idea why there are 26 bytes allocated for the pointer I created? I'm compiling with gcc and glibc. Here's the code:

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

#define ARRLEN 14

int main(void) {

    /* + 1 for the null terminator */
    char *charptr = malloc((sizeof(*charptr) * ARRLEN) + 1);
    if (!charptr)
        exit(EXIT_FAILURE);

    memset(charptr, '\0', (sizeof(*charptr) * ARRLEN) + 1);

    /* here's the intentionally incorrect call to memset() */
    memset(charptr, 'a', 100);

    printf("sizeof(char)   ------  %ld\n", sizeof(char));
    printf("sizeof(charptr)   ---  %ld\n", sizeof(charptr));
    printf("sizeof(*charptr)  ---  %ld\n", sizeof(*charptr));
    printf("sizeof(&charptr)  ---  %ld\n", sizeof(&charptr));
    printf("strlen(charptr)   ---  %ld\n", strlen(charptr));
    printf("charptr string   ----  >>%s<<\n", charptr);

    free(charptr);

    return 0;
}

This is the output I get:

sizeof(char)   ------  1
sizeof(charptr)   ---  8
sizeof(*charptr)  ---  1
sizeof(&charptr)  ---  8
strlen(charptr)   ---  26
charptr string   ----  >>aaaaaaaaaaaaaaaaaaaaaaaa<<

Solution

  • First of all, this is undefined behavior, so anything can happen. As noted in a comment, on my machine I get exactly the same behavior as you with optimizations disabled; with optimizations turned on, gcc warns about a potential buffer overflow at compile time (impressive job, gcc!) and the program crashes at runtime. Even better, if I print the string with puts before the printf calls, it comes out with a different number of 'a' characters.

    Still, I have the dubious luck of seeing exactly the same behavior as you, so let's investigate. I compiled your program with no optimization and with debug information:

    [matteo@teokubuntu ~/scratch]$ gcc -g memset_test.c 
    

    Then I fired up the debugger and set a breakpoint on the first printf, just after the memset:

    Reading symbols from a.out...done.
    (gdb) break 20
    Breakpoint 1 at 0x87e: file memset_test.c, line 20.
    (gdb) r
    Starting program: /home/matteo/scratch/a.out 
    
    Breakpoint 1, main () at memset_test.c:20
    20          printf("sizeof(char)   ------  %ld\n", sizeof(char));
    

    Now we can set a hardware watchpoint on charptr[26], the byte at index 26, which is where strlen ends up stopping:

    (gdb) p charptr
    $1 = 0x555555756260 'a' <repeats 100 times>
    (gdb) watch charptr[26]
    Hardware watchpoint 2: charptr[26]
    

    ... and so...

    (gdb) c
    Continuing.
    
    Hardware watchpoint 2: charptr[26]
    
    Old value = 97 'a'
    New value = 0 '\000'
    _int_malloc (av=av@entry=0x7ffff7dcfc40 <main_arena>, bytes=bytes@entry=1024) at malloc.c:4100
    4100    malloc.c: No such file or directory.
    (gdb) bt
    #0  _int_malloc (av=av@entry=0x7ffff7dcfc40 <main_arena>, bytes=bytes@entry=1024) at malloc.c:4100
    #1  0x00007ffff7a7b0fc in __GI___libc_malloc (bytes=1024) at malloc.c:3057
    #2  0x00007ffff7a6218c in __GI__IO_file_doallocate (fp=0x7ffff7dd0760 <_IO_2_1_stdout_>) at filedoalloc.c:101
    #3  0x00007ffff7a72379 in __GI__IO_doallocbuf (fp=fp@entry=0x7ffff7dd0760 <_IO_2_1_stdout_>) at genops.c:365
    #4  0x00007ffff7a71498 in _IO_new_file_overflow (f=0x7ffff7dd0760 <_IO_2_1_stdout_>, ch=-1) at fileops.c:759
    #5  0x00007ffff7a6f9ed in _IO_new_file_xsputn (f=0x7ffff7dd0760 <_IO_2_1_stdout_>, data=<optimized out>, n=23)
        at fileops.c:1266
    #6  0x00007ffff7a3f534 in _IO_vfprintf_internal (s=0x7ffff7dd0760 <_IO_2_1_stdout_>, 
        format=0x5555555549c8 "sizeof(char)   ------  %ld\n", ap=ap@entry=0x7fffffffe330) at vfprintf.c:1328
    #7  0x00007ffff7a48f26 in __printf (format=<optimized out>) at printf.c:33
    #8  0x0000555555554894 in main () at memset_test.c:20
    (gdb) 
    

    So it is simply malloc code, called indirectly by printf (which allocates a 1024-byte buffer for stdout on its first use), writing to the memory block immediately adjacent to the one it gave you, possibly to mark that block as in use.

    Long story short: you wrote to memory that wasn't yours, and its rightful owner modified it the first time it needed it; nothing particularly strange or interesting.