Search code examples
deadlockdynamics-ax-2012

Dead Lock With Multiple AOS (Dynamics AX 2012 R2)


We are using Microsoft Dynamics AX 2012 R2 and have multiple AOS servers and in some cases we experience database dead lock on a specific table. We are using SQL Dead Lock Detector for investigation.

AOS1 psid XX (Delete) -><- AOS2 psid XX (Update)

where both are accessing table BATCH. AOS1 is doing Delete operation while AOS2 is doing an update operation.

I am not really technical with Dynamics AX but I would like to ask if this is a configuration setting or what are our options cause I know that you could have multiple AOS with Dynamics AX? For now, we are thinking of using a single AOS to avoid this. TIA!

UPDATE: Here is the information extracted from SQL Dead Lock Detector

<EVENT_INSTANCE>
  <EventType>DEADLOCK_GRAPH</EventType> 
  <PostTime>2016-05-18T08:23:26.457</PostTime> 
  <SPID>31</SPID> 
- <TextData>
- <deadlock-list>
- <deadlock victim="process187b921088">
- <process-list>
- <process id="process187b921088" taskpriority="0" logused="0" waitresource="KEY: 5:72057594054180864 (1012d480fedd)" waittime="1388" ownerId="7211112518" transactionname="user_transaction" lasttranstarted="2016-05-18T08:23:25.067" XDES="0x1561eaf0a0" lockMode="S" schedulerid="18" kpid="4772" status="suspended" spid="151" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2016-05-18T08:23:25.070" lastbatchcompleted="2016-05-18T08:23:25.070" lastattention="2016-05-16T15:37:08.080" clientapp="Microsoft Dynamics AX" hostname="XXX-PRODAOS3" hostpid="3796" loginname="XXX\axaos" isolationlevel="read committed (2)" xactid="7211112518" currentdb="5" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
- <executionStack>
  <frame procname="adhoc" line="1" stmtstart="212" stmtend="1406" sqlhandle="0x0200000068d839079e1e808253171c305e0cb77c0e699b800000000000000000000000000000000000000000">unknown</frame> 
  </executionStack>
  <inputbuf>(@P1 int,@P2 datetime2,@P3 int,@P4 int,@P5 int,@P6 int,@P7 int,@P8 int,@P9 int,@P10 int,@P11 int,@P12 int)UPDATE BATCH SET STATUS=@P1,MODIFIEDDATETIME=@P2,RECVERSION=@P3 WHERE ((STATUS=@P4) AND (CONSTRAINTTYPE=@P5)) AND EXISTS (SELECT 'x' FROM BATCHJOB T2 WITH ( READCOMMITTEDLOCK) WHERE ((T2.STATUS=@P6) AND (BATCH.BATCHJOBID=T2.RECID)) AND NOT (EXISTS (SELECT 'x' FROM BATCHCONSTRAINTS T3 WITH ( READCOMMITTEDLOCK) WHERE EXISTS (SELECT 'x' FROM BATCH T4 WITH ( READCOMMITTEDLOCK) WHERE (((T3.DEPENDSONBATCHID=T4.RECID) AND (T3.BATCHID=BATCH.RECID)) AND ((((T4.STATUS<>@P7) AND (T4.STATUS<>@P8)) OR ((T3.EXPECTEDSTATUS=@P9) AND (T4.STATUS=@P10))) OR ((T3.EXPECTEDSTATUS=@P11) AND (T4.STATUS=@P12))))))))</inputbuf> 
  </process>
- <process id="process187b911848" taskpriority="0" logused="1496" waitresource="KEY: 5:72057594054967296 (1f6c1d9698d0)" waittime="1339" ownerId="7211110381" transactionname="user_transaction" lasttranstarted="2016-05-18T08:23:25.050" XDES="0x187b9610d0" lockMode="U" schedulerid="16" kpid="2028" status="suspended" spid="87" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2016-05-18T08:23:25.117" lastbatchcompleted="2016-05-18T08:23:25.117" lastattention="1900-01-01T00:00:00.117" clientapp="Microsoft Dynamics AX" hostname="XXX-PRODAOS2" hostpid="3292" loginname="XXX\axaos" isolationlevel="read committed (2)" xactid="7211110381" currentdb="5" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056">
- <executionStack>
  <frame procname="adhoc" line="1" stmtstart="50" stmtend="448" sqlhandle="0x020000007a68dc3969bbab373f6d3381cb08da5d66262e300000000000000000000000000000000000000000">unknown</frame> 
  </executionStack>
  <inputbuf>(@P1 int,@P2 int,@P3 int)DELETE FROM BATCH WHERE EXISTS (SELECT 'x' FROM BATCHJOB T2 WITH ( READCOMMITTEDLOCK) WHERE (((T2.FINISHING=@P1) AND (BATCH.BATCHJOBID=T2.RECID)) AND ((BATCH.RUNTIMETASK=@P2) OR (T2.RUNTIMEJOB=@P3))))</inputbuf> 
  </process>
  </process-list>
- <resource-list>
- <keylock hobtid="72057594054180864" dbid="5" objectname="TPI_AxNew_Prod.dbo.BATCHJOB" indexname="I_2096RECID" id="lockf9d3e0580" mode="X" associatedObjectId="72057594054180864">
- <owner-list>
  <owner id="process187b911848" mode="X" /> 
  </owner-list>
- <waiter-list>
  <waiter id="process187b921088" mode="S" requestType="wait" /> 
  </waiter-list>
  </keylock>
- <keylock hobtid="72057594054967296" dbid="5" objectname="TPI_AxNew_Prod.dbo.BATCH" indexname="I_2827BATCHJOBID" id="lock15d81c0a80" mode="U" associatedObjectId="72057594054967296">
- <owner-list>
  <owner id="process187b921088" mode="U" /> 
  </owner-list>
- <waiter-list>
  <waiter id="process187b911848" mode="U" requestType="wait" /> 
  </waiter-list>
  </keylock>
  </resource-list>
  </deadlock>
  </deadlock-list>
  </TextData>
  <TransactionID /> 
  <LoginName>sa</LoginName> 
  <StartTime>2016-05-18T08:23:26.457</StartTime> 
  <ServerName>XXX-AXDB2</ServerName> 
  <LoginSid>AQ==</LoginSid> 
  <EventSequence>546028</EventSequence> 
  <IsSystem>1</IsSystem> 
  <SessionLoginName /> 
  </EVENT_INSTANCE>

Solution

  • The information from the SQL Deadlock Detector tells us that we are looking for two SQL statements of the following structure in the x++ code:

    UPDATE BATCH 
      SET STATUS=@P1,MODIFIEDDATETIME=@P2,RECVERSION=@P3 
      WHERE ((STATUS=@P4) AND (CONSTRAINTTYPE=@P5)) AND EXISTS 
        (SELECT 'x' FROM BATCHJOB T2 WITH ( READCOMMITTEDLOCK) 
          WHERE ((T2.STATUS=@P6) AND (BATCH.BATCHJOBID=T2.RECID)) AND NOT (EXISTS 
            (SELECT 'x' FROM BATCHCONSTRAINTS T3 WITH ( READCOMMITTEDLOCK) 
              WHERE EXISTS 
                (SELECT 'x' FROM BATCH T4 WITH ( READCOMMITTEDLOCK) 
                  WHERE (((T3.DEPENDSONBATCHID=T4.RECID) AND (T3.BATCHID=BATCH.RECID)) AND ((((T4.STATUS<>@P7) AND (T4.STATUS<>@P8)) OR ((T3.EXPECTEDSTATUS=@P9) AND (T4.STATUS=@P10))) OR ((T3.EXPECTEDSTATUS=@P11) AND (T4.STATUS=@P12)))))))) 
    
    
    DELETE FROM BATCH WHERE EXISTS 
      (SELECT 'x' FROM BATCHJOB T2 WITH ( READCOMMITTEDLOCK) 
        WHERE (((T2.FINISHING=@P1) AND (BATCH.BATCHJOBID=T2.RECID)) AND ((BATCH.RUNTIMETASK=@P2) OR (T2.RUNTIMEJOB=@P3))))
    

    I was able to find these in class BatchRun in method serverProcessDependencies (for the update) and method serverProcessFinishedJobs (for the delete). This tells us that this is caused by a batch job that uses multithreading. There are several out-of-the-box batch jobs that use multithreading, one of the most commonly used is the master scheduling batch job. But I suspect that your problem is caused by a custom implementation of a multithreading batch job.

    As suggested in the comments, take a look at the batch jobs that are running when the deadlock occurs and analyze if they use multithreading and if they are custom batch jobs. You will need an experienced developer to do this analysis.

    A temporary solution may be what Vincent16 proposes in Deadlock issue when running job in batch franework, which is to set the "Maximum batch threads" setting of the batch server to 1 and only allow one batch server. But this may severely impact the performance of batch jobs, so make sure to test this before using it in production.