I have an x64 crash dump of a managed (C#) application that P/Invokes into native code. The dump was taken after the native code attempted to dereference a bad memory location, and after the .NET marshaler had turned it into an AccessViolationException. As a result, the stack frame where the error occurred is no longer available, and the thread where the exception occurred has been hijacked by the CLR exception handler:
0:017> kb
# RetAddr : Args to Child : Call Site
00 000007fe`fd3b10dc : 00000000`0402958b 00000000`20000002 00000000`00000e54 00000000`00000e4c : ntdll!NtWaitForSingleObject+0xa
01 000007fe`ea9291eb : 00000000`00000000 00000000`00000cdc 00000000`00000000 00000000`00000cdc : KERNELBASE!WaitForSingleObjectEx+0x79
02 000007fe`ea929197 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : clr!CLREventWaitHelper2+0x38
03 000007fe`ea929120 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : clr!CLREventWaitHelper+0x1f
04 000007fe`ead8cae5 : 00000000`29cbc7c0 00000000`3213ce40 00000000`00000000 00000000`ffffffff : clr!CLREventBase::WaitEx+0x70
05 000007fe`ead8c9d0 : 00000000`29cbc7c0 00000000`00000000 00000000`0002b228 00000000`0002b228 : clr!Thread::WaitSuspendEventsHelper+0xf5
06 000007fe`eacf2145 : 00000000`007ea060 000007fe`ea924676 00000000`00000000 000007fe`fd3b18da : clr!Thread::WaitSuspendEvents+0x11
07 000007fe`eaccc00c : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : clr!Thread::RareEnablePreemptiveGC+0x33a905
08 000007fe`eae2c762 : 00000000`00000000 00000000`007cbce0 00000000`29cbc7c0 00000000`00000001 : clr!Thread::RareDisablePreemptiveGC+0x31b40c
09 000007fe`eaf662d4 : 00000000`00000000 00000000`007cbce0 00000000`29cbc7c0 00000000`00000000 : clr!EEDbgInterfaceImpl::DisablePreemptiveGC+0x22
0a 000007fe`eaf66103 : 00000000`29cb0100 00000000`00000000 00000000`3213cf80 00000000`29cbca20 : clr!Debugger::SendExceptionHelperAndBlock+0x174
0b 000007fe`eaf65d0d : ffffffff`ffffffff 00000000`29cbca20 00000000`29cbc700 000007fe`eaf62100 : clr!Debugger::SendExceptionEventsWorker+0x343
0c 000007fe`eaf61bd8 : 00000000`00000100 00000000`00000000 00000000`00000019 00000000`3213dd01 : clr!Debugger::SendException+0x15d
0d 000007fe`eadac75d : 00000000`007cbce0 00000000`3213d258 00000000`3213d1e8 00000000`00000001 : clr!Debugger::LastChanceManagedException+0x1f8
0e 000007fe`eaf698c7 : 000075ce`2b30e018 00000000`00000000 00000000`00000001 00000000`00000000 : clr!NotifyDebuggerLastChance+0x6d
0f 000007fe`eaf6af20 : 00000000`00000000 000007fe`8cf40020 000007fe`8cfa200c 4328fffe`43e0fffe : clr!Debugger::UnhandledHijackWorker+0x1a7
10 000007fe`eaaacbf0 : 00000000`0000000a 00000000`2ab23e30 00000000`00000001 00000000`00000000 : clr!ExceptionHijackWorker+0xc0
11 00000000`3213d8c0 : 00000000`3213ddb0 00000000`00000001 00000000`00000000 00000000`0000000b : clr!ExceptionHijack+0x30
12 00000000`3213ddb0 : 00000000`00000001 00000000`00000000 00000000`0000000b 00000000`0035578c : 0x3213d8c0
13 00000000`00000001 : 00000000`00000000 00000000`0000000b 00000000`0035578c ffffffff`00000002 : 0x3213ddb0
14 00000000`00000000 : 00000000`0000000b 00000000`0035578c ffffffff`00000002 00000000`00350268 : 0x1
And .exr -1 (display most recent exception) returns:
0:017> .exr -1
ExceptionAddress: 00000000771d685a (user32!ZwUserMessageCall+0x000000000000000a)
ExceptionCode: 80000004 (Single step exception)
ExceptionFlags: 00000000
NumberParameters: 0
The call to user32!ZwUserMessageCall is at the top of the stack of thread 0, not thread 17 where the native exception occurred, so I can only assume it is not pointing to my exception.
I can dump the access violation exception to get some info about the native error:
0:017> !DumpObj /d 0000000012175640
Name: System.AccessViolationException
MethodTable: 000007fee9a61fe8
EEClass: 000007fee9528300
Size: 176(0xb0) bytes
File: C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
Fields:
MT Field Offset Type VT Attr Value Name
000007fee9a50e08 4000002 8 System.String 0 instance 000000001217b538 _className
000007fee9a5b218 4000003 10 ...ection.MethodBase 0 instance 0000000000000000 _exceptionMethod
000007fee9a50e08 4000004 18 System.String 0 instance 0000000000000000 _exceptionMethodString
000007fee9a50e08 4000005 20 System.String 0 instance 0000000012179818 _message
000007fee9a61f18 4000006 28 ...tions.IDictionary 0 instance 0000000000000000 _data
000007fee9a51038 4000007 30 System.Exception 0 instance 0000000000000000 _innerException
000007fee9a50e08 4000008 38 System.String 0 instance 0000000000000000 _helpURL
000007fee9a513e8 4000009 40 System.Object 0 instance 0000000012179ad0 _stackTrace
000007fee9a513e8 400000a 48 System.Object 0 instance 0000000012179c68 _watsonBuckets
000007fee9a50e08 400000b 50 System.String 0 instance 0000000000000000 _stackTraceString
000007fee9a50e08 400000c 58 System.String 0 instance 0000000000000000 _remoteStackTraceString
000007fee9a53980 400000d 88 System.Int32 1 instance 0 _remoteStackIndex
000007fee9a513e8 400000e 60 System.Object 0 instance 0000000000000000 _dynamicMethods
000007fee9a53980 400000f 8c System.Int32 1 instance -2147467261 _HResult
000007fee9a50e08 4000010 68 System.String 0 instance 0000000000000000 _source
000007fee9a54a00 4000011 78 System.IntPtr 1 instance 0 _xptrs
000007fee9a53980 4000012 90 System.Int32 1 instance -532462766 _xcode
000007fee9a02d50 4000013 80 System.UIntPtr 1 instance 0 _ipForWatsonBuckets
000007fee9a3d210 4000014 70 ...ializationManager 0 instance 0000000012179900 _safeSerializationManager
000007fee9a513e8 4000001 0 System.Object 0 shared static s_EDILock
>> Domain:Value 00000000007e09b0:NotInit <<
000007fee9a54a00 400018a 98 System.IntPtr 1 instance 7fedad179f4 _ip
000007fee9a54a00 400018b a0 System.IntPtr 1 instance fffffffc2ab22078 _target
000007fee9a53980 400018c 94 System.Int32 1 instance 0 _accessType
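The signed decimal values that !DumpObj prints for _HResult and _xcode are easier to recognise as unsigned 32-bit hex. The following is a minimal sketch of that conversion; the helper name as_u32 is made up for illustration and is not part of any debugger tooling.

```python
def as_u32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as its unsigned bit pattern."""
    return value & 0xFFFFFFFF

# Values copied from the !DumpObj output above.
hresult = as_u32(-2147467261)
xcode = as_u32(-532462766)

print(f"_HResult = 0x{hresult:08X}")  # 0x80004003
print(f"_xcode   = 0x{xcode:08X}")    # 0xE0434352
```

0x80004003 is E_POINTER, the HRESULT that AccessViolationException carries, and 0xE0434352 is the SEH code the CLR uses for managed exceptions, so neither field holds the native fault details by itself.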
From this I can see the address of the instruction that failed (7fedad179f4) and the address that the code tried to dereference (fffffffc2ab22078). It looks like a sign-extension or overflow bug, but it's not obvious from the code how that might have happened. The instruction referenced is:
0:017> u 7fedad179f4
MYDLL!_interpolate+0x174 [c:\my\source\file.c @ 85]:
000007fe`dad179f4 f3450f59548404 mulss xmm10,dword ptr [r12+rax*4+4]
To debug this further, I need the register context from when the native code crashed, to see what was in r12 and rax. Is it possible to retrieve this?
Edit: I tried to get information about the parameters to ExceptionHijackWorker, but the values don't make sense to me. The function signature according to @S.T.'s link is
void STDCALL ExceptionHijackWorker(T_CONTEXT * pContext,
EXCEPTION_RECORD * pRecord,
EHijackReason::EHijackReason reason,
void * pData);
So a first parameter of 0000000a doesn't make sense as a pointer. And dumping the second parameter, 000000002ab23e30, yields nonsensical data for the EXCEPTION_RECORD:
0:017> dd 000000002ab23e30
00000000`2ab23e30 00000019 00000019 2ab23e40 00000000
00000000`2ab23e40 42b8f800 42b8de00 42b89b00 42b85000
00000000`2ab23e50 42b81b00 42b7a000 42b72600 42b6fa00
00000000`2ab23e60 42b6a000 42b67a00 42b63600 42b59c00
00000000`2ab23e70 42b4fc00 42b4da00 42b49e00 42b46a00
00000000`2ab23e80 42b38e00 42b31c00 42b2d600 42b29000
00000000`2ab23e90 42b2ec00 42b2fa00 42b2a000 42b27a00
00000000`2ab23ea0 42b23e00 42b6e800 42b6ab00 42b66c80
The values 0x19 and 0x19 for the ExceptionCode and ExceptionFlags don't make sense; there is no exception code with that value, and the flag is documented as being either zero or EXCEPTION_NONCONTINUABLE, which is defined as 1.
Am I misinterpreting anything here?
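One possible reading, offered only as a guess: the dwords from 2ab23e40 onward look less like an EXCEPTION_RECORD and more like IEEE 754 single-precision floats, which would fit a routine that executes mulss on a float array. A small sketch that decodes a few of the dumped values (the helper name dword_to_float is hypothetical):

```python
import struct

def dword_to_float(dword: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", dword))[0]

# First row of data after the two 0x19 dwords in the dd output above.
dwords = [0x42B8F800, 0x42B8DE00, 0x42B89B00, 0x42B85000]
values = [dword_to_float(d) for d in dwords]

for d, v in zip(dwords, values):
    print(f"0x{d:08X} -> {v}")
```

If that reading is right, the address is a pointer into the native code's own data rather than an argument to ExceptionHijackWorker. On x64 the "Args to Child" columns of kb show stack home slots, which need not contain the real arguments.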
Following advice from @S.T., I started probing around the call stack to see if I could find an exception record or context record. I started around the strangeness at the bottom of the stack, namely:
0:017> k
# Child-SP RetAddr Call Site
...
0f 00000000`3213d210 000007fe`eaf6af20 clr!Debugger::UnhandledHijackWorker+0x1a7
10 00000000`3213d850 000007fe`eaaacbf0 clr!ExceptionHijackWorker+0xc0
11 00000000`3213d880 00000000`3213d8c0 clr!ExceptionHijack+0x30
12 00000000`3213d8a8 00000000`3213ddb0 0x3213d8c0
13 00000000`3213d8b0 00000000`00000001 0x3213ddb0
14 00000000`3213d8b8 00000000`00000000 0x1
I happened to find the exception record:
0:017> .exr 00000000`3213ddb0
ExceptionAddress: 000007fedad179f4 (MYDLL!_interpolate+0x0000000000000174)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 0000000000000000
Parameter[1]: fffffffc2ab22078
Attempt to read from address fffffffc2ab22078
And then I happened to find the context record (what I was looking for):
0:017> .cxr 00000000`3213d8c0
rax=0000000000000019 rbx=000000000000000a rcx=00000000709c7c88
rdx=0000000000000002 rsi=000000002ab23e30 rdi=0000000080000000
rip=000007fedad179f4 rsp=000000003213dff0 rbp=0000000000000019
r8=000007ffffe22000 r9=0000000070910000 r10=0000000000000000
r11=000000003213e0a0 r12=fffffffc2ab22010 r13=000000002b50ae40
r14=000000002ab241ec r15=0000000000000003
iopl=0 nv up ei pl nz na pe nc
cs=0033 ss=002b ds=0000 es=0000 fs=0000 gs=0000 efl=00010200
MYDLL!_interpolate+0x174:
000007fe`dad179f4 f3450f59548404 mulss xmm10,dword ptr [r12+rax*4+4] ds:fffffffc`2ab22078=????????
I can see my bad pointer in r12 now!
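As a sanity check that this context record really belongs to the fault, the operand [r12+rax*4+4] can be recomputed from the recovered registers and compared with the address in the exception record. This is a minimal sketch using only the values shown above; the variable names are mine, and treating the low 32 bits of r12 as the "intended" address is an assumption, not something the dump proves.

```python
MASK64 = (1 << 64) - 1

def to_signed64(value: int) -> int:
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    return value - (1 << 64) if value >> 63 else value

# Register values from the .cxr output above.
r12 = 0xFFFFFFFC2AB22010
rax = 0x19

# Effective address of: mulss xmm10, dword ptr [r12+rax*4+4]
effective = (r12 + rax * 4 + 4) & MASK64
print(f"effective address = 0x{effective:016X}")

# Assumption: the valid pointer would have been the low 32 bits of r12.
low32 = r12 & 0xFFFFFFFF
offset = to_signed64(r12) - low32
print(f"r12 is off by {offset} bytes (0x{-offset:X} below 0x{low32:08X})")
```

The computed address matches fffffffc2ab22078, and r12 sits exactly 2^34 bytes below its low 32 bits. That is consistent with a negative index or offset being sign-extended before it was added to the base pointer, though other explanations are possible.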
I don't understand what these stack frames are, or why the exception and context records were stored as the return address for them. Any comments on this would be great, for me and for future readers.