c global-variables free allocation extern

Calling free on a pointer to an extern variable in C


I would like to know the behavior of a C program calling free on a pointer to an extern variable. The background is that I'm a developer of a verifier analyzing C code and I wonder what my verifier should do if it encounters such a situation (e.g., say why the program is undefined - if it is).

To find out the behavior experimentally, I tried to run the following C program:

#include <stdlib.h>

extern int g = 1;

int main() {
    int *ptr = &g;
    free(ptr);
    return g;
}

On a Debian GNU/Linux 7 system, this program crashes with an error message indicating that the pointer passed to free is invalid. On a Windows 7 system, I could run this program without any error message. Would you know of an explanation for this observation?

UPDATE: I did read the definition of free. My question is whether this definition actually rules out the possibility that such a program might reliably work on a standard-conforming system (and not just because "it can do anything if the behavior is undefined"). So I would like to know whether you can think of a configuration or system where this program does not exhibit undefined behavior. In other words: Are there conditions under which the call to free here would be defined properly according to the C standard?


Solution

  • The C standard is unambiguous about this. Quoting document N1570, the closest approximation to C11 available online at no charge, section 7.22.3.3 para 2 (the specification of free):

    The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs. Otherwise, if the argument does not match a pointer earlier returned by a memory management function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined.

    "Memory management functions" are listed at the beginning of 7.22.3: malloc, calloc, realloc, and aligned_alloc. (An implementation could add more such functions, e.g. posix_memalign -- read the notes at the bottom!)

    Now, "the behavior is undefined" licenses an implementation to do anything when the situation occurs. Crashing is common, but MSVC's runtime library is perfectly entitled to detect that a pointer is outside the "heap" and do nothing. Experiment with debugging modes: there's probably a mode where it will crash the program instead.

    As the author of a code-verifying tool, you should be maximally strict: if you can't prove that a pointer passed to free is either NULL or a value previously returned by a memory management function, flag that as an error.
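
    To make that rule concrete, here is a minimal runtime sketch of the check. It is an illustration only: a static verifier would reason about pointer provenance symbolically rather than keep a table, and every name here (tracked_malloc, checked_free, free_status, MAX_LIVE) is hypothetical. It tracks malloc only and assumes a small fixed-size table. The variable g stands in for the one in the question.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; a fixed-size table keeps the sketch short. */
#define MAX_LIVE 1024

static void *live[MAX_LIVE];    /* pointers returned by tracked_malloc, not yet freed */
static size_t live_count = 0;

int g = 1;                      /* stands in for the variable g from the question */

/* Allocate with malloc and remember the returned pointer. */
void *tracked_malloc(size_t size) {
    if (live_count == MAX_LIVE)
        return NULL;            /* table full: report it as an allocation failure */
    void *p = malloc(size);
    if (p != NULL)
        live[live_count++] = p;
    return p;
}

/* Outcome of checking one free call against the rule in 7.22.3.3. */
typedef enum { FREE_OK, FREE_NULL_NOOP, FREE_INVALID } free_status;

free_status checked_free(void *p) {
    if (p == NULL)
        return FREE_NULL_NOOP;  /* free(NULL) is defined: no action occurs */
    for (size_t i = 0; i < live_count; i++) {
        if (live[i] == p) {
            live[i] = live[--live_count];  /* forget it, so a second free is flagged */
            free(p);
            return FREE_OK;
        }
    }
    return FREE_INVALID;        /* not a live allocation: flag it, do not call free */
}
```

    With this sketch, checked_free(&g) reports FREE_INVALID without ever reaching the real free, which is the verdict a strict verifier would give for the program in the question.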


    Addendum: the somewhat confusing "or if the space has been deallocated..." clause is intended to prohibit double deallocation:

    char *x = malloc(42);
    free(x); // ok
    free(x); // undefined behavior
    

    ... but beware of memory reuse:

    char *x = malloc(42);
    uintptr_t a = (uintptr_t)x;
    free(x);
    x = malloc(42);
    uintptr_t b = (uintptr_t)x;
    
    observe(a == b); // un*specified* behavior - must be either true or false,
                     // but no guarantee which
    free(x); // ok regardless of whether a == b
    

    Double addendum:

    Are there conditions under which the call to free here would be defined properly according to the C standard?

    No. If there were such a condition, it would have to appear in the text of the standard as an exception to the rule I quoted at the beginning of this answer, and there aren't any such exceptions.

    However, there is a subtle variation to which the answer is 'yes':

    Could there be an implementation of C under which the behavior of the program shown is always well-defined?

    For instance, an implementation in which free is documented to do nothing, regardless of its input, would qualify and is not even a crazy idea—many programs can get away with never calling free, after all. But the behavior of the program is still undefined according to the C standard; it is just that the C implementation has chosen to make this particular UB scenario well-defined itself.
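
    As a rough sketch of what such an implementation could look like, consider a toy bump allocator whose free is documented to ignore its argument. The names toy_malloc, toy_free, free_then_read_g and the arena size are hypothetical, and the code assumes a C11 compiler (for _Alignas and max_align_t). It is not how any particular C library works.

```c
#include <stddef.h>

/* Hypothetical toy allocator: names and arena size are illustrative only. */
#define ARENA_SIZE 4096

static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Hand out the next chunk of the arena, rounded up to keep alignment. */
void *toy_malloc(size_t size) {
    const size_t align = _Alignof(max_align_t);
    if (size > ARENA_SIZE)
        return NULL;                        /* also avoids overflow when rounding */
    size_t rounded = (size + align - 1) / align * align;
    if (rounded > ARENA_SIZE - arena_used)
        return NULL;                        /* arena exhausted */
    void *p = &arena[arena_used];
    arena_used += rounded;
    return p;
}

/* Documented to do nothing, whatever the argument is. */
void toy_free(void *ptr) {
    (void)ptr;
}

int g = 1;                                  /* same role as g in the question */

/* Mirrors the question's main, with toy_free in place of free. */
int free_then_read_g(void) {
    int *ptr = &g;
    toy_free(ptr);
    return g;
}
```

    Under this implementation's own documentation, free_then_read_g returns 1 every time. That guarantee comes from the implementation, not from the C standard.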

    (From the language-lawyer perspective, every implementation extension to the language is a case of the implementation making a UB scenario well-defined. Even nigh-ubiquitous things like #include <unistd.h>.)