Part of our product is an IE plugin (BHO), which is running happily in lots of different environments across multiple OS versions/IE versions.
However, in a trial setup for one customer, running XP SP3 machines via Citrix XenDesktop, IE 7 is crashing when the following two conditions are met:
Some extra info:
Either plugin seems to work fine independently. Disabling Flash, or skipping any sites that use Flash, prevents the crash.
The customer is reasonably accommodating, and I was able to run IE with the MS Debugging Tools in order to capture a few dumps at the time of the crash. I'm now having some trouble interpreting the dumps. Thinking it was heap corruption, I ran the debugging tools with full pageheap enabled, but that did not trigger a breakpoint.
The analysis from the Debugging tools is as follows:
In iexplore_PID_5064_Date_12_20_2011__Time_11_19_26AM_161_Second_Chance_Exception_C0000005.dmp the assembly instruction at ole32!HandleIncomingCall+e2 in C:\WINDOWS\system32\ole32.dll from Microsoft Corporation has caused an access violation exception (0xC0000005) when trying to read from memory location 0x03ce4ff8 on thread
The stack trace at the point of crash is:
Thread 7 - System ID 1140
Entry point ieframe!CTabWindow::_TabWindowThreadProc
Create time 20/12/2011 19:18:08
Time spent in user mode 0 Days 0:0:19.828
Time spent in kernel mode 0 Days 0:0:10.468
Full Call Stack
Function Arg 1 Arg 2 Arg 3 Arg 4 Source
ole32!HandleIncomingCall+e2 0f9aafbc 00000034 00000001 07e8ab6c
ole32!STAInvoke+24 17444f80 00000001 0781efc0 077e8f10
ole32!AppInvoke+7e 17444f28 077e8f10 0781efc0 07e8ab6c
ole32!ComInvokeWithLockAndIPID+2c2 17444f28 077ec420 00000000 17444f28
ole32!ComInvoke+60 17444f28 00000400 0774ee30 07bcfe48
ole32!ThreadDispatch+23 17444f28 07bcfeb0 7752b096 00000000
ole32!ThreadWndProc+fe 005d0594 078b6ee0 0000babe 17444f2c
user32!InternalCallWinProc+28 7752b096 005d0594 00000400 0000babe
user32!UserCallWinProcCheckWow+150 00000000 7752b096 005d0594 00000400
user32!DispatchMessageWorker+306 7bcff64 00000000 07bcffb4 3e25e69b
user32!DispatchMessageW+f 07bcff64 0013e490 0013e5b8 07868ff0
ieframe!CTabWindow::_TabWindowThreadProc+189 07e03e30 0013e490 0013e5b8 07868ff0
kernel32!BaseThreadStart+37 3e25e464 07868ff0 00000000 00000000
I'm going to see what else I can get from this dump file, but I'm hoping someone here will have a great idea. I'd like to test a lot more stuff at the customer site, but we only have so many chances with them, so I need to use any time I get there very wisely.
For me a couple of next steps seem to be:
Sometimes the crash happens in pseudoserverinproc.dll, which is part of HDX MediaStream, the Citrix feature that runs Flash content locally rather than on the server.
== update
I've had quite a bit of success with WinDbg analysing the dumps that I have. I think it makes quite a bit of sense to try and use gflags/windbg on the desktop that is having the troubles and debug it live.
That would be my recommended next step to anyone in a similar position at the moment. I will know more about how good this advice is in a week's time, when I've had a chance to apply it.
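For anyone wanting to try the gflags/WinDbg approach mentioned above, here is a minimal sketch of the commands involved, wrapped in Python so they can be reviewed before running on the customer machine. It assumes gflags.exe and windbg.exe from the Debugging Tools for Windows are on the PATH and that the script runs from an elevated prompt. The helper names are made up for illustration, and the script only prints the commands unless dry_run is set to False.

```python
import subprocess

IMAGE = "iexplore.exe"  # the process image being debugged in this post


def pageheap_enable_cmd(image=IMAGE):
    # "gflags /p /enable <image> /full" turns on full page heap for that image
    return ["gflags.exe", "/p", "/enable", image, "/full"]


def pageheap_disable_cmd(image=IMAGE):
    # "gflags /p /disable <image>" turns page heap off again when finished
    return ["gflags.exe", "/p", "/disable", image]


def windbg_attach_cmd(pid):
    # "windbg -p <pid>" attaches the debugger to an already running process
    return ["windbg.exe", "-p", str(pid)]


def run(cmd, dry_run=True):
    # Print the command; only execute it when dry_run is explicitly False.
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(pageheap_enable_cmd())
    run(windbg_attach_cmd(5064))  # example PID, taken from the dump file name
    run(pageheap_disable_cmd())
```

Page heap settings are stored per image name and only apply to processes started afterwards, so IE needs to be restarted after enabling it, and the setting should be disabled again once testing is done.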
We solved the problem in the end (well, worked around it). If anyone is interested, this is how we did it.
Analysing the crash dumps with WinDbg (which is a great tool), we found that the problem was isolated to showing WinForms in iexplore.exe after Flash had loaded in XenDesktop deployments. Knowing this, we were able to work around the problem.
The key was getting good crash dumps, working out a minimal reproduction scenario and having a good customer that let us test our theory!