c++, c, debugging, visual-c++

Why do I get _CrtIsValidHeapPointer(block) and/or is_block_type_valid(header->_block_use) assertions?


When I run my compiled programs in Visual Studio in debug mode, I sometimes get

Debug assertion failed! Expression: _CrtIsValidHeapPointer(block)

or

Debug assertion failed! Expression: is_block_type_valid(header->_block_use)

(or both, one after the other) assertions.

What does it mean? How can I find and fix the origin of such problems?


Solution

  • These assertions show that either the pointer which should be freed is not (or no longer) valid (the _CrtIsValidHeapPointer assertion), or that the heap was corrupted at some point during the run of the program (the is_block_type_valid(header->_block_use) assertion, aka the _Block_Type_Is_Valid (pHead->nBlockUse) assertion in earlier versions).

    When acquiring memory from the heap, malloc/free don't communicate directly with the OS but with a memory manager, which is usually provided by the corresponding C runtime. Visual Studio/the Windows SDK provide a special heap memory manager for debug builds, which performs additional sanity checks at run time.

    _CrtIsValidHeapPointer is just a heuristic, but there are enough cases of invalid pointers for which this function can report a problem.

    1. When does the _CrtIsValidHeapPointer assertion fire?

    These are some of the most common scenarios:

    A. Pointer doesn't point to a memory from the heap to begin with:

    char *mem = "not on the heap!";
    free(mem); 
    

    Here the string literal isn't stored on the heap and thus cannot (and should not) be freed.

    B. The value of the pointer isn't the original address returned by malloc/calloc:

    unsigned char *mem = (unsigned char*)malloc(100);
    mem++;
    free(mem); // mem has wrong address!
    

    As the value of mem is no longer suitably aligned after the increment (heap blocks on 64-bit Windows are 16-byte aligned), the sanity check can easily see that it cannot be a heap pointer!
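
    The alignment argument can be sketched portably. This is only an illustration of the kind of cheap plausibility check such a heuristic can perform, not the check the debug CRT actually implements; the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Returns true if `ptr` is a multiple of `alignment` (which must be a power of two).
bool is_aligned(const void* ptr, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Hypothetical plausibility check: a pointer returned by malloc is aligned
// at least for std::max_align_t, so a pointer that is not aligned that way
// cannot be the start of a malloc'ed block.
bool could_be_malloc_result(const void* ptr) {
    return ptr != nullptr && is_aligned(ptr, alignof(std::max_align_t));
}
```

    A pointer that passes this test may still be invalid, so a check like this can only reject pointers, never confirm them.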

    A slightly more complex, but not unusual C++-example (mismatch new[] and delete):

    struct A {
        int a = 0;
        ~A() {// destructor is not trivial!
             std::cout << a << "\n";
        }
    };
    A *mem = new A[10];
    delete mem; // wrong: memory from new[] must be released with delete[]
    

    When new A[n] is called, sizeof(size_t) + n*sizeof(A) bytes of memory are actually allocated via malloc (when the destructor of the class A is not trivial). The number of elements in the array is saved at the beginning of the allocated memory, and the returned pointer mem points not to the original address returned by malloc, but to that address plus an offset of sizeof(size_t). However, delete knows nothing about this offset and tries to free the pointer with the wrong address (delete [] would do the right thing).
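
    The bookkeeping described above can be imitated by hand. The following is a minimal sketch with hypothetical helper names; it mimics the count-plus-offset scheme and is not the compiler's actual implementation of new[].

```cpp
#include <cstddef>
#include <cstdlib>

// Imitates the allocation step: reserve room for an element count plus
// n elements and return the address *after* the count.
void* alloc_with_count(std::size_t n, std::size_t elem_size) {
    void* raw = std::malloc(sizeof(std::size_t) + n * elem_size);
    if (raw == nullptr) return nullptr;
    *static_cast<std::size_t*>(raw) = n;  // element count stored up front
    return static_cast<unsigned char*>(raw) + sizeof(std::size_t);
}

// Recovers the address that malloc originally returned.
void* raw_block(void* user) {
    return static_cast<unsigned char*>(user) - sizeof(std::size_t);
}

// Reads the element count stored in front of the user pointer.
std::size_t stored_count(void* user) {
    return *static_cast<std::size_t*>(raw_block(user));
}

// Counterpart of delete[]: undo the offset before calling free.
void free_with_count(void* user) {
    if (user != nullptr) std::free(raw_block(user));
}
```

    Passing the user pointer straight to free instead of free_with_count would be the analogue of the delete/delete[] mismatch: the address handed to the memory manager is off by sizeof(size_t).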

    C. double-free:

    unsigned char *mem = (unsigned char*)malloc(10);
    free(mem);
    free(mem);  // the pointer is already freed
    

    A very common cause in C++ is that the rule of three/five isn't adhered to, e.g.:

    struct A {// bad: doesn't adhere to rule of three
        int* ptr;
        A(int i): ptr(new int(i)){}
        ~A() { delete ptr; }
    };
    
    {
      A a(0);
      A b = a; // a and b share pointer: a.ptr == b.ptr
    } // the destructors of b and a are called here => problem
    //  first b.ptr gets deleted,
    //  then deleting the (already deleted) a.ptr leads to UB/error.
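
    One possible fix is to give the class its own copy operations, so that every object owns a separate allocation. A minimal sketch, using the hypothetical name OwningA to keep it apart from the broken A above:

```cpp
// Variant of A that adheres to the rule of three: each object owns its own int.
struct OwningA {
    int* ptr;

    explicit OwningA(int i) : ptr(new int(i)) {}

    // Copy constructor: deep copy instead of sharing the pointer.
    OwningA(const OwningA& other) : ptr(new int(*other.ptr)) {}

    // Copy assignment: keep our own allocation and copy only the value.
    OwningA& operator=(const OwningA& other) {
        if (this != &other) {
            *ptr = *other.ptr;
        }
        return *this;
    }

    ~OwningA() { delete ptr; }  // every pointer is deleted exactly once
};
```

    With this variant, `OwningA b = a;` leaves a.ptr and b.ptr pointing to different allocations, so both destructors can run safely. Storing the value in a std::unique_ptr<int> (or directly as an int) would avoid the manual bookkeeping altogether.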
    

    D. pointer from another runtime/memory manager

    Windows programs can use multiple runtimes at once: every DLL in use could potentially have its own runtime/memory manager/heap, because the runtime was linked statically or because the DLLs use different runtime versions. Thus, freeing memory that was allocated in one DLL can fail in another DLL which uses a different heap (see for example this SO-question or this SO-question).
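
    A common way to avoid this problem is to let the module that allocates memory also provide the function that frees it. The sketch below shows the idea; the names mylib_create_buffer and mylib_free_buffer are hypothetical, and in a real project both functions would be exported from the same DLL.

```cpp
#include <cstddef>
#include <cstdlib>

// Allocates a buffer with the runtime this module was linked against.
extern "C" unsigned char* mylib_create_buffer(std::size_t size) {
    return static_cast<unsigned char*>(std::malloc(size));
}

// Frees a buffer obtained from mylib_create_buffer. Because this function
// lives in the same module, it uses the same runtime and heap as the malloc above.
extern "C" void mylib_free_buffer(unsigned char* buffer) {
    std::free(buffer);
}
```

    Callers in other modules then never call free on the buffer themselves; they hand it back via mylib_free_buffer.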

    2. When does the is_block_type_valid(header->_block_use) assertion fire?

    In cases A. and B. above, is_block_type_valid(header->_block_use) will fire as well. After the _CrtIsValidHeapPointer assertion, the free function (more precisely free_dbg_nolock) looks up information in the block header (a special data structure used by the debug heap, more about it later on) and checks that the block type is valid. However, because the pointer is completely bogus, the memory location where _block_use is expected holds some random value.

    However, there are some scenarios in which is_block_type_valid(header->_block_use) fires without a preceding _CrtIsValidHeapPointer assertion.

    A. _CrtIsValidHeapPointer doesn't detect the invalid pointer

    Here is an example:

    unsigned char *mem = (unsigned char*)malloc(100);
    mem+=64;
    free(mem);
    

    Because the debug heap fills the allocated memory with 0xCD, we can be sure that reading _block_use will yield an invalid type, thus leading to the above assertion.

    B. Corruption of the heap

    Most of the time, when is_block_type_valid(header->_block_use) fires without _CrtIsValidHeapPointer, it means that the heap was corrupted by some out-of-range writes.

    So if we are "delicate" (and don't overwrite "no man's land", more on that later):

    unsigned char *mem = (unsigned char*)malloc(100);
    *(mem-17)=64; // trashes _block_use.
    free(mem);
    

    this leads only to the is_block_type_valid(header->_block_use) assertion.


    In all of the above cases, it is possible to find the underlying issue by following the memory allocations, but knowing more about the structure of the debug heap helps a lot.

    An overview of the debug heap can be found e.g. in the documentation; alternatively, all details of the implementation can be found in the corresponding Windows Kit (e.g. C:\Program Files (x86)\Windows Kits\10\Source\10.0.16299.0\ucrt\heap\debug_heap.cpp).

    In a nutshell: when memory is allocated on the debug heap, more memory than requested is allocated, so additional structures such as "no man's land" and additional info such as _block_use can be stored next to the "real" memory. The actual memory layout is:

    ------------------------------------------------------------------------
    | header of the block + no man's land |  "real" memory | no man's land |
    ------------------------------------------------------------------------
    |    32 bytes         +      4bytes   |     ? bytes    |     4 bytes   |
    ------------------------------------------------------------------------
    

    Every byte in "no man's land" at the beginning and at the end is set to a special value (0xFD), so once one of them is overwritten, an out-of-bounds write can be detected (as long as it is at most 4 bytes off).

    For example, in the case of a new[]/delete mismatch we can analyze the memory before the pointer to see whether it is no man's land or not (here as code, but normally done in the debugger):
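
    The guard-byte idea can be imitated in portable code. The following is a minimal sketch under the assumptions stated above (4 guard bytes filled with 0xFD on each side); the helper names are hypothetical and the block header of the real debug heap is left out.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

constexpr std::size_t kGuardSize = 4;       // assumed size of each guard area
constexpr unsigned char kGuardByte = 0xFD;  // fill value of the guard areas

// Allocates `size` usable bytes with a guard area directly before and after them.
unsigned char* guarded_alloc(std::size_t size) {
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(size + 2 * kGuardSize));
    if (raw == nullptr) return nullptr;
    std::memset(raw, kGuardByte, kGuardSize);                      // front guard
    std::memset(raw + kGuardSize + size, kGuardByte, kGuardSize);  // back guard
    return raw + kGuardSize;
}

// Returns false if any guard byte around the user memory was overwritten.
bool guards_intact(const unsigned char* user, std::size_t size) {
    const unsigned char* front = user - kGuardSize;
    const unsigned char* back = user + size;
    for (std::size_t i = 0; i < kGuardSize; ++i) {
        if (front[i] != kGuardByte || back[i] != kGuardByte) return false;
    }
    return true;
}

// Releases a block obtained from guarded_alloc.
void guarded_free(unsigned char* user) {
    if (user != nullptr) std::free(user - kGuardSize);
}
```

    Calling guards_intact before guarded_free corresponds to the check the debug heap performs when a block is freed: a write that lands further away than the guard area goes unnoticed by it.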

    
    A *mem = new A[10];
    ...
    // instead of
    //delete mem;
    // investigate memory:
    unsigned char* ch = reinterpret_cast<unsigned char*>(mem);
    for (int i = 0; i < 16; i++) {
        std::cout << (int)(*(ch - i)) << " ";
    }
    

    we get:

    0 0 0 0 0 0 0 0 10 253 253 253 253 0 0 52
    

    i.e. the 8 bytes directly before the pointer hold the number of elements (10), then we see "no man's land" (0xFD=253) and then other information. It is easy to see what is going wrong: if the pointer were correct, the 4 bytes directly before it would be 253.

    When the debug heap frees memory, it overwrites it with a special byte value: 0xDD, i.e. 221. One can also prevent the reuse of freed memory by setting the flag _CRTDBG_DELAY_FREE_MEM_DF, so the memory stays marked not only directly after the free call but during the whole run of the program. So when we try to free the same pointer a second time, the debug heap can see that the memory was already freed once and fire the assertion.

    Thus, it is also easy to see that the problem is a double free by analyzing the values around the pointer:

    unsigned char *mem = (unsigned char*)malloc(10);
    free(mem);
    for (int i = 0; i < 16; i++) {
        printf("%d ", (int)(*(mem - i)));
    }
    free(mem); //second free
    

    prints

    221 221 221 221 221 221 221 221 221 221 221 221 221 221 221 221
    

    i.e. the memory was already freed once.

    On the detection of heap corruption:

    The purpose of no man's land is to detect out-of-range writes. This, however, only works for writes that are off by at most 4 bytes in either direction, e.g.:

    unsigned char *mem = (unsigned char*)malloc(100);
    *(mem-1)=64; // trashes no man's land
    free(mem);
    

    leads to

    HEAP CORRUPTION DETECTED: before Normal block (#13266) at 0x0000025C6CC21050.
    CRT detected that the application wrote to memory before start of heap buffer.
    

    A good way to find heap corruption is to use _CrtSetDbgFlag(_CRTDBG_CHECK_ALWAYS_DF) or ASSERT(_CrtCheckMemory()); (see this SO-post). However, this is somewhat indirect; a more direct way is to use gflags as explained in this SO-post (it is not unusual for a program run with gflags to need about 30 times more memory and to be about 10 times slower).
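
    The two CRT calls mentioned above could be wrapped as follows. This is a sketch, not a prescribed setup: the wrapper names are hypothetical, and the preprocessor guards are an assumption added so that the code also compiles in release builds and with other compilers, where the wrappers do nothing.

```cpp
#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>
#endif

// Asks the debug heap to verify the whole heap on every allocation and
// deallocation (slow). Returns true if the debug heap is available.
bool enable_full_heap_checks() {
#if defined(_MSC_VER) && defined(_DEBUG)
    int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);  // query the current flags
    flags |= _CRTDBG_ALLOC_MEM_DF | _CRTDBG_CHECK_ALWAYS_DF;
    _CrtSetDbgFlag(flags);
    return true;
#else
    return false;  // no debug heap in this build
#endif
}

// Runs a one-off consistency check of the debug heap.
bool heap_looks_intact() {
#if defined(_MSC_VER) && defined(_DEBUG)
    return _CrtCheckMemory() != 0;
#else
    return true;  // nothing to check without the debug heap
#endif
}
```

    Calling enable_full_heap_checks() once at program start makes the first allocation or deallocation after a corrupting write report the problem; sprinkling heap_looks_intact() through suspicious code narrows down the location further.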


    Btw, the definition of _CrtMemBlockHeader has changed over time and is no longer the one shown in the online help, but:

    struct _CrtMemBlockHeader
    {
        _CrtMemBlockHeader* _block_header_next;
        _CrtMemBlockHeader* _block_header_prev;
        char const*         _file_name;
        int                 _line_number;
        
        int                 _block_use;
        size_t              _data_size;
        
        long                _request_number;
        unsigned char       _gap[no_mans_land_size];
    
        // Followed by:
        // unsigned char    _data[_data_size];
        // unsigned char    _another_gap[no_mans_land_size];
    };
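
    With this definition, the offsets used in the examples above can be recomputed. The sketch below uses a replica of the struct for offset arithmetic only; the replica's names are hypothetical, no_mans_land_size is assumed to be 4, and std::int32_t stands in for long, which is 4 bytes with Visual C++.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNoMansLandSize = 4;  // assumption: 4-byte gaps

// Replica of the header layout, used for offset calculations only.
struct HeaderReplica {
    HeaderReplica* block_header_next;
    HeaderReplica* block_header_prev;
    char const*    file_name;
    int            line_number;
    int            block_use;
    std::size_t    data_size;
    std::int32_t   request_number;  // `long` is 4 bytes with Visual C++
    unsigned char  gap[kNoMansLandSize];
};

// Number of bytes between the start of a header member and the user data,
// which begins directly after the header.
constexpr std::size_t bytes_before_data(std::size_t member_offset) {
    return sizeof(HeaderReplica) - member_offset;
}

constexpr std::size_t kBlockUseDistance =
    bytes_before_data(offsetof(HeaderReplica, block_use));
constexpr std::size_t kGapDistance =
    bytes_before_data(offsetof(HeaderReplica, gap));
```

    Under these assumptions, on a 64-bit build block_use starts 20 bytes before the user data and occupies the bytes mem-20 to mem-17, which is why the write to *(mem-17) in the heap-corruption example hits _block_use, while the gap occupies mem-4 to mem-1.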