windbg labview

Windows sometimes crashes while LabVIEW operates Advantech PCI-1716 cards


The system includes two Advantech PCI-1716 cards: http://www.advantech.com/products/PCI-1716/mod_86EC4C4D-F497-45C5-81DA-B8600C0EB36F.aspx

I wrote a program with NI LabVIEW 7.1 that controls the two PCI-1716 cards. It worked very well.

But about a year later, the computer started crashing and sometimes restarting on its own. Nobody had changed the software or the hardware.

I used WinDbg to analyze the Windows crash dump file, and I think I found something abnormal.

Based on the WinDbg result, I thought the PCI-1716 driver might be corrupted, so I reinstalled it. But the problem still happens.

I also reinstalled Windows XP. The problem happened again.

I don't know how to fix it.

Maybe one of the PCI-1716 cards is faulty. But there are two cards, so how can I find out which one?

The WinDbg result is:

Loading Dump File [F:\WinDBG\Mini012313-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: srv*C:\Documents and Settings\cky\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows XP Kernel Version 2600 (Service Pack 2) MP (2 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 2600.xpsp_sp2_rtm.040803-2158
Kernel base = 0x804d8000 PsLoadedModuleList = 0x8055d700
Debug session time: Wed Jan 23 23:09:23.250 2013 (GMT+8)
System Uptime: 0 days 23:47:48.802
Loading Kernel Symbols
.......................................................................................................
Loading User Symbols
Loading unloaded module list
....
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 1000000A, {ffdf, 2, 1, 806e5a8e}

Unable to load image ADS1716S.sys, Win32 error 0n2
*** WARNING: Unable to verify timestamp for ADS1716S.sys
*** ERROR: Module load completed but symbols could not be loaded for ADS1716S.sys
Probably caused by : ADS1716S.sys ( ADS1716S+f08 )

Followup: MachineOwner
---------

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 0000ffdf, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000001, bitfield :
    bit 0 : value 0 = read operation, 1 = write operation
    bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: 806e5a8e, address which referenced memory

Debugging Details:
------------------


WRITE_ADDRESS:  0000ffdf 

CURRENT_IRQL:  2

FAULTING_IP:
hal!KeAcquireQueuedSpinLock+42
806e5a8e 8902            mov     dword ptr [edx],eax

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  DRIVER_FAULT

BUGCHECK_STR:  0xA

PROCESS_NAME:  LabVIEW.exe

LAST_CONTROL_TRANSFER:  from 804efbb0 to 806e5a8e

STACK_TEXT:
aa286c00 804efbb0 aa286c20 804f0de4 aa286c1c hal!KeAcquireQueuedSpinLock+0x42
aa286c08 804f0de4 aa286c1c 8602dc38 806e53b8 nt!IoAcquireCancelSpinLock+0xe
aa286c20 804f104a 8602dc38 00000001 aa286c58 nt!IopStartNextPacket+0x18
aa286c30 f788ef08 8602dc38 00000001 85eb0230 nt!IoStartNextPacket+0x38
WARNING: Stack unwind information not available. Following frames may be wrong.
aa286c58 80575529 85eb0230 86090da0 85f32448 ADS1716S+0xf08
aa286c80 805d1cb9 85f32448 85f32448 85f32690 nt!IoCancelThreadIo+0x33
aa286d08 805d209a 00000001 85f32448 00000000 nt!PspExitThread+0x403
aa286d28 805d2275 85f32448 00000001 aa286d64 nt!PspTerminateThreadByPointer+0x52
aa286d54 8054160c 00000000 00000001 0012e810 nt!NtTerminateProcess+0x105
aa286d54 7c92eb94 00000000 00000001 0012e810 nt!KiFastCallEntry+0xfc
0012e810 00000000 00000000 00000000 00000000 0x7c92eb94


STACK_COMMAND:  kb

FOLLOWUP_IP:
ADS1716S+f08
f788ef08 ??              ???

SYMBOL_STACK_INDEX:  4

SYMBOL_NAME:  ADS1716S+f08

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: ADS1716S

IMAGE_NAME:  ADS1716S.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  45b96de9

FAILURE_BUCKET_ID:  0xA_W_ADS1716S+f08

BUCKET_ID:  0xA_W_ADS1716S+f08

Followup: MachineOwner
---------

0: kd> lmvm ADS1716S
start    end        module name
f788e000 f7893ac0   ADS1716S T (no symbols)
    Loaded symbol image file: ADS1716S.sys
    Image path: ADS1716S.sys
    Image name: ADS1716S.sys
    Timestamp:        Fri Jan 26 10:56:41 2007 (45B96DE9)
    CheckSum:         00010C75
    ImageSize:        00005AC0
    Translations:     0000.04b0 0000.04e0 0409.04b0 0409.04e0
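
As a sanity check on the numbers in the dump, the arithmetic can be reproduced in a few lines. This is only a sketch: the helper names `decode_access_bits` and `module_offset` are made up for illustration and are not part of WinDbg. It decodes Arg3 using the bit meanings printed by `!analyze -v`, and checks that the return address f788ef08 from the stack falls inside the ADS1716S range reported by `lmvm`.

```python
def decode_access_bits(arg3: int) -> dict:
    """Decode Arg3 of bugcheck 0xA using the bit meanings printed by !analyze -v."""
    return {
        "write": bool(arg3 & 0x1),    # bit 0: 0 = read operation, 1 = write operation
        "execute": bool(arg3 & 0x8),  # bit 3: 1 = execute operation
    }


def module_offset(address: int, start: int, end: int):
    """Return the offset of address inside [start, end), or None if it lies outside."""
    if start <= address < end:
        return address - start
    return None


# Values copied from the dump above.
access = decode_access_bits(0x00000001)
offset = module_offset(0xF788EF08, 0xF788E000, 0xF7893AC0)

print(access)       # {'write': True, 'execute': False}
print(hex(offset))  # 0xf08
```

So the fault is a write to address 0000ffdf at IRQL 2, and the frame at f788ef08 sits at offset f08 inside ADS1716S.sys, which matches the "ADS1716S+f08" that WinDbg reports.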

Solution

  • I'd run Memtest86+ first to check whether you have any bad RAM.

    After that, you could pull one card out and write a simple script that does some I/O (similar to what you normally do). If a crash takes a while to appear, put the I/O in a loop, let it run for a while, and see if you can force a crash. Repeat the test with each card on its own to see whether the problem follows just one of them.

    Does the crash happen often or rarely?
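
A rough sketch of such a loop is below. It is an assumption-heavy outline, not a working test for the PCI-1716: `read_once` is a hypothetical placeholder for whatever call actually performs one I/O operation on the installed card (the Advantech driver API is not shown here), and `fake_read` only stands in so the harness can be run as-is. The one design choice worth keeping is that each iteration is written and synced to disk before the I/O call, so after a blue screen the last line of the log shows how far the run got.

```python
import os
import time
from typing import Callable


def stress_card(read_once: Callable[[], float], iterations: int,
                log_path: str, pause_s: float = 0.0) -> int:
    """Call read_once repeatedly, logging each iteration to disk before it runs."""
    completed = 0
    with open(log_path, "a", encoding="utf-8") as log:
        for i in range(iterations):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} iteration {i} start\n")
            log.flush()
            os.fsync(log.fileno())  # force the line to disk in case the machine crashes
            read_once()             # hypothetical: one I/O operation on the card
            completed += 1
            if pause_s:
                time.sleep(pause_s)
    return completed


def fake_read() -> float:
    """Placeholder for a real card read; replace with the actual I/O call."""
    return 0.0


if __name__ == "__main__":
    done = stress_card(fake_read, iterations=1000, log_path="card0_stress.log")
    print(f"completed {done} iterations")
```

Since the stack trace in the question shows the fault on the nt!NtTerminateProcess and nt!IoCancelThreadIo path, it may also be worth starting and stopping the test program many times with I/O still in progress, not just looping inside one long-running process.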