The system include 2 Advantech PCI-1716 card:http://www.advantech.com/products/PCI-1716/mod_86EC4C4D-F497-45C5-81DA-B8600C0EB36F.aspx
I write a program with NI LabVIEW 7.1. The LabVIEW program can control the two PCI 1716 card. It work very good.
But about a year later, The computer crashed and sometimes auto restart. Nobody change the software and hardware.
I used WinDbg to analysis the Windows crash dump file and I think I find something abnormal.
For the WinDbg result, I think maybe the PCI-1716 driver was broken so I re-installed it. But the problem also happen.
And I also re-installed my windows xp. The also problem happen again.
I don't know how.
Maybe the PCI 1716 hardware broken. But how to find which one, there are two PCI 1716 card.
The WinDbg result is:
Loading Dump File [F:\WinDBG\Mini012313-01.dmp] Mini Kernel Dump File: Only registers and stack trace are available
Symbol search path is: srv*C:\Documents and Settings\cky\symbols*http://msdl.microsoft.com/download/symbols Executable search path is: Windows XP Kernel Version 2600 (Service Pack 2) MP (2 procs) Free x86 compatible Product: WinNt, suite: TerminalServer SingleUserTS Built by: 2600.xpsp_sp2_rtm.040803-2158 Kernel base = 0x804d8000 PsLoadedModuleList = 0x8055d700 Debug session time: Wed Jan 23 23:09:23.250 2013 (GMT+8) System Uptime: 0 days 23:47:48.802 Loading Kernel Symbols ....................................................................................................... Loading User Symbols Loading unloaded module list ....
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck 1000000A, {ffdf, 2, 1, 806e5a8e}
Unable to load image ADS1716S.sys, Win32 error 0n2
*** WARNING: Unable to verify timestamp for ADS1716S.sys
*** ERROR: Module load completed but symbols could not be loaded for ADS1716S.sys Probably caused by : ADS1716S.sys ( ADS1716S+f08 )
Followup: MachineOwner
---------
0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
IRQL_NOT_LESS_OR_EQUAL (a) An attempt was made to access a pageable (or completely invalid) address at an interrupt request level (IRQL) that is too high. This is usually caused by drivers using improper addresses. If a kernel debugger is available get the stack backtrace. Arguments: Arg1: 0000ffdf, memory referenced Arg2: 00000002, IRQL Arg3: 00000001, bitfield : bit 0 : value 0 = read operation, 1 = write operation bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status) Arg4: 806e5a8e, address which referenced memory
Debugging Details:
------------------
WRITE_ADDRESS: 0000ffdf
CURRENT_IRQL: 2
FAULTING_IP: hal!KeAcquireQueuedSpinLock+42 806e5a8e 8902 mov dword ptr [edx],eax
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT
BUGCHECK_STR: 0xA
PROCESS_NAME: LabVIEW.exe
LAST_CONTROL_TRANSFER: from 804efbb0 to 806e5a8e
STACK_TEXT: aa286c00 804efbb0 aa286c20 804f0de4 aa286c1c hal!KeAcquireQueuedSpinLock+0x42 aa286c08 804f0de4 aa286c1c 8602dc38 806e53b8 nt!IoAcquireCancelSpinLock+0xe aa286c20 804f104a 8602dc38 00000001 aa286c58 nt!IopStartNextPacket+0x18 aa286c30 f788ef08 8602dc38 00000001 85eb0230 nt!IoStartNextPacket+0x38 WARNING: Stack unwind information not available. Following frames may be wrong. aa286c58 80575529 85eb0230 86090da0 85f32448 ADS1716S+0xf08 aa286c80 805d1cb9 85f32448 85f32448 85f32690 nt!IoCancelThreadIo+0x33 aa286d08 805d209a 00000001 85f32448 00000000 nt!PspExitThread+0x403 aa286d28 805d2275 85f32448 00000001 aa286d64 nt!PspTerminateThreadByPointer+0x52 aa286d54 8054160c 00000000 00000001 0012e810 nt!NtTerminateProcess+0x105 aa286d54 7c92eb94 00000000 00000001 0012e810 nt!KiFastCallEntry+0xfc 0012e810 00000000 00000000 00000000 00000000 0x7c92eb94
STACK_COMMAND: kb
FOLLOWUP_IP: ADS1716S+f08 f788ef08 ?? ???
SYMBOL_STACK_INDEX: 4
SYMBOL_NAME: ADS1716S+f08
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: ADS1716S
IMAGE_NAME: ADS1716S.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 45b96de9
FAILURE_BUCKET_ID: 0xA_W_ADS1716S+f08
BUCKET_ID: 0xA_W_ADS1716S+f08
Followup: MachineOwner
---------
0: kd> lmvm ADS1716S start end module name f788e000 f7893ac0 ADS1716S T (no symbols)
Loaded symbol image file: ADS1716S.sys
Image path: ADS1716S.sys
Image name: ADS1716S.sys
Timestamp: Fri Jan 26 10:56:41 2007 (45B96DE9)
CheckSum: 00010C75
ImageSize: 00005AC0
Translations: 0000.04b0 0000.04e0 0409.04b0 0409.04e0
I'd run Memtest86+ just to see if you have any bad RAM.
Following that maybe you can pull 1 card out and write a simple script to do some I/O (similar to what you're doing normally maybe). If it takes a while you could put it into a loop and run it for a while and see if you can force a crash. You could repeat the test on each card to see if you have any issues with just one of the cards.
Does the crash happen often or rarely?