Tags: c, malloc, gets

gets and malloc in C


Somewhere on GitHub I saw the following piece of code:

char *p=malloc(1);
gets(p);
printf(p);

I tried the same and found that it works. No matter how long a string I type, it gets stored and doesn't cause a segmentation fault. How does this work? I only allocated 1 byte.

Also, when I add free(p); it gives strange output.


Solution

  • That code is a textbook example of why gets was removed from the standard library in the 2011 standard (C11). It is a buffer overflow that malware can exploit.

    gets reads a sequence of characters from standard input until it sees a newline character (or reaches end-of-file), and stores that sequence, followed by a zero-valued terminator, in the buffer starting at the address p. gets has no idea how large the target buffer is, and if the input sequence is longer than what the buffer is sized to hold, then gets will happily store those excess characters to the memory immediately following the buffer, potentially causing all kinds of mayhem.

    You allocate a buffer that's all of 1 byte wide. When you call gets, it writes the first character of input to that buffer, and then any additional input (plus a zero-valued terminator) to the unallocated heap memory immediately following your buffer.

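    In this specific case, nothing that your program visibly depends on is being overwritten, so your code appears to function normally. The strange output you see after adding free(p) is most likely a symptom of the same overflow: allocators commonly keep their bookkeeping data next to the blocks they hand out, and gets may have overwritten some of it. In another context, this code may cause other data to be corrupted or cause a runtime error.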

    The behavior on writing past the end of a buffer is undefined; the compiler doesn't have to warn you of anything, and the compiled code can do anything from crash outright to execute a virus to work as expected.
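
    For comparison, a bounded read could look like the sketch below. It relies on the standard fgets function, which takes the buffer size as an argument and never writes more than that many bytes. The helper name read_line_bounded is made up for this illustration and does not come from the code in the question.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): reads at most
 * size - 1 characters of one line from `in` into buf and strips the
 * trailing newline if one was read. Returns buf, or NULL on end-of-file
 * or read error. Unlike gets, it is told how large the buffer is. */
char *read_line_bounded(char *buf, size_t size, FILE *in)
{
    assert(buf != NULL && in != NULL);
    assert(size > 1 && size <= INT_MAX);   /* fgets takes an int count */

    if (fgets(buf, (int)size, in) == NULL)
        return NULL;

    /* strcspn returns the index of the first '\n', or the string length
     * if there is none, so this write always stays inside the string. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

    With such a helper, the three lines from the question would become something like: char line[64]; if (read_line_bounded(line, sizeof line, stdin)) printf("%s\n", line); where 64 is an arbitrary size chosen for the example. Input longer than the buffer is cut off and the remainder stays in the stream for the next read, instead of being written past the end of the array. Passing "%s" as the format also avoids treating user input as a format string, which printf(p) does.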

    So,

    1. NEVER NEVER NEVER NEVER NEVER use gets. Ever. Under any circumstances. Not even in toy code. Like I said, it is no longer part of the standard library.

    2. C puts all the burden of resource management on you, the programmer - buffers will not automatically grow to accommodate additional input, nor is there any automatic garbage collection to clean up dynamic memory that's no longer being referenced.

    3. C will not protect you from doing something stupid - the language assumes you know what you are doing at all times.
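
Since buffers do not grow on their own (point 2), accepting input of arbitrary length means growing the buffer yourself. The following is a minimal sketch of one way to do that with fgets and realloc; the function name read_line_dynamic, the starting capacity of 16 bytes, and the doubling strategy are all assumptions made for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original post): reads one line of any
 * length from `in` into a heap buffer that is grown with realloc as needed.
 * Returns a NUL-terminated string without the trailing newline, or NULL on
 * end-of-file with no input, read error, or allocation failure.
 * The caller owns the result and must free() it. */
char *read_line_dynamic(FILE *in)
{
    assert(in != NULL);

    size_t cap = 16;   /* current capacity in bytes (arbitrary start) */
    size_t len = 0;    /* characters stored so far */
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Tell fgets exactly how much room is left, terminator included. */
        size_t room = cap - len;
        int n = room > INT_MAX ? INT_MAX : (int)room;
        if (fgets(buf + len, n, in) == NULL)
            break;                       /* end-of-file or read error */

        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';         /* complete line: strip newline */
            return buf;
        }

        if (cap - len <= 1) {            /* buffer full: double it */
            if (cap > SIZE_MAX / 2) {
                free(buf);
                return NULL;
            }
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);               /* realloc failure keeps old block */
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    if (len == 0 || ferror(in)) {
        free(buf);
        return NULL;
    }
    return buf;   /* final line that ended without a newline */
}
```

A caller would use it along the lines of: char *s = read_line_dynamic(stdin); if (s) { printf("%s\n", s); free(s); } and every successful call has to be paired with a free, since nothing else will reclaim the memory.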