Search code examples
sql-serverxmlt-sqlprocedure

SELECT in XML Field is slow


I'm writing a SQL Server 2005 stored procedure but the select on xml is really slow.

This is my code:

INSERT INTO @T    
    SELECT DISTINCT 
        tr.ID, tr.Identificazione as identification,
        ft.NumeroFattura, ft.IDFattura,
        dbo.getInvoicePayedStatus(ft.IDFattura)
    FROM 
        TestReport AS tr
    INNER JOIN 
        Job AS j ON j.ID = tr.Job
    INNER JOIN 
        Plan_Main AS pm ON pm.TESTREPORT = tr.ID
    INNER JOIN 
        Fatture AS ft ON ft.IDFattura IN (SELECT T.N.value('(text())[1]','int') 
                                          FROM XMLDATA.nodes('InvoiceList/id') AS T(N))
WHERE 
    tr.DocumentStatus = 4 
    AND j.NomeCliente = @companyId 
    AND ft.IDFattura IS NOT NULL
ORDER BY 
    tr.id ASC

and this is the part that slows my stored procedure:

(SELECT T.N.value('(text())[1]','int') 
 FROM pm.XMLDATA.nodes('InvoiceList/id') AS T(N))

An example of the xml:

<PrecedentTask />
<UsedResources />
<InvoiceList>
    <id>4350</id>
</InvoiceList>

The number of row's is relative small after the "SQL Where" (about 30) but the query requires more than 8 seconds!

I also tried with the "=" operator instead "IN" and the performance is great (less than 1 second) but when my xml's "invoiceList" contains more than one row it doesn't work.

Do you have any solution ?


Solution

  • At the moment your query is returning matching rows between the following tables.

    TestReport as tr
    INNER JOIN Job       as j  
    on j.ID = tr.Job            -- Only matching rows betwwen "TestReport" & "Job"
    INNER JOIN Plan_Main as pm 
    on pm.TESTREPORT = tr.ID    -- Only matching rows betwwen "TestReport" & "Plan_Main"
    

    But when you join the above tables with the Fatture table with NO defined relation between any of the above mentioned tables and the Fatture table

    It actually corss joins the result set returned from the above joins and the Fatture table where the IDFattura is in the XML document.

    I would say your problem is the JOINS not the query shredding XML.

    You need to define a relationship in the ON clause when joining to Fatture table and filter out the rows using the same IN (subquery) in the ON clause or in the where clause.

    Something like...

    FROM TestReport as tr
    INNER JOIN Job       as j  on j.ID = tr.Job
    INNER JOIN Plan_Main as pm on pm.TESTREPORT = tr.ID
    INNER JOIN Fatture   as ft on ft.IDFattura =         --<-- A column to join it back to any of the above tables                          
    WHERE tr.DocumentStatus = 4 
      and j.NomeCliente = @companyId 
      and ft.IDFattura is not null
      and ft.IDFattura IN  (SELECT T.N.value('(text())[1]','int') 
                            FROM pm.XMLDATA.nodes('InvoiceList/id') AS T(N))
    ORDER BY tr.id ASC