windows windbg heap-corruption

How should I interpret WinDbg heap verification results?


In WinDbg, I ran the command !heap -s -v on seven different crash dumps caused by heap corruption, and all of them give these results:

..................List corrupted: (Blink->Flink = 0000000000000000) != (Block = 00000000026d0010)
HEAP 0000000002030000 (Seg 00000000026d0000) At 00000000026d0000 Error: block list entry corrupted

HEAP 0000000002030000 (Seg 00000000026d0000) At 00000000026d0000 Error: SegmentIndex field corrupted

HEAP 0000000002030000 (Seg 00000000026d0000) At 0000000002749400 Error: invalid block size

How should I interpret those results?

I took (Seg 00000000026d0000) to mean that WinDbg treats that address as a segment (_HEAP_SEGMENT), but in every dump it is actually the address stored in the heap's LargeBlocksIndex field, i.e. the large block cache:

+0x2b8 LargeBlocksIndex : 0x00000000`026d0000 Void

I've verified that dumps of the same process on the same operating system, taken before the crash, pass WinDbg's heap verification without errors. The errors appear only once the crash occurs.

In short, I don't know why WinDbg is complaining about the 26d0000 address or why it might be interpreting it as a segment (if that's even what it's doing).

All dumps were from a Windows 2003 R2 machine. The process is 64-bit.


Solution

  • It turns out that the particular crash I was dealing with was caused by a limit on the amount of data that the Windows 2003 heap segments can hold (~106 GiB). The heap had become so fragmented that the program could not find room inside the segments for an allocation just under 1 MiB in size. I had initially ruled this out because of the amount of physical RAM on the machine (192 GiB) and because of the per-process address space limit (8 TiB).

    I have yet to figure out the reason for WinDbg's results, but I am willing to disregard them as a likely false positive: the heap was probably left in an inconsistent state by the code paths that run when it exhausts its segment memory.
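To illustrate the fragmentation point above, here is a minimal Python sketch of why a request just under 1 MiB can fail even when plenty of memory is free in total. It is only a toy model, not how the Windows heap manager works: the hole sizes, the hole count, and the helper name can_allocate are all made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def can_allocate(free_runs, request):
    """Return True if any single contiguous free run can hold the request.

    free_runs is a list of free-hole sizes in bytes (hypothetical data).
    An allocation needs one contiguous run; free bytes scattered across
    several holes cannot be combined to satisfy it.
    """
    return any(run >= request for run in free_runs)


# Hypothetical fragmented segment space: 4096 free holes of 512 KiB each,
# every hole separated from the next by a live allocation.
free_runs = [512 * KIB] * 4096

# A request just under 1 MiB, like the one described in the answer.
request = MIB - 4 * KIB

total_free = sum(free_runs)
largest_run = max(free_runs)

print("total free bytes:  ", total_free)
print("largest free run:  ", largest_run)
print("request size:      ", request)
print("request satisfied: ", can_allocate(free_runs, request))
```

In this model the total free space is far larger than the request, yet the request fails because no single hole is big enough. If the segment space cannot grow any further, the allocation has nowhere else to go.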