Search code examples
sql-serverxmlsqlxmlsql-server-2017

Retrieve specific fields of the imported to SQL Server multiple XMLs


I have multiple XML files with the same structure, I have imported them to SQL Server 2017 with the following commands:

DDL:

CREATE DATABASE xmlFiles
GO

USE xmlFiles
CREATE TABLE tblXMLFiles (IntCol int, XmlData xml);
GO

DML:

USE xmlFiles
INSERT INTO [dbo].[tblXMLFiles](XmlData) SELECT * FROM OPENROWSET(BULK 'C:\xmls\1.xml', SINGLE_BLOB) AS x;
INSERT INTO [dbo].[tblXMLFiles](XmlData) SELECT * FROM OPENROWSET(BULK 'C:\xmls\2.xml', SINGLE_BLOB) AS x;
…
INSERT INTO [dbo].[tblXMLFiles](XmlData) SELECT * FROM OPENROWSET(BULK 'C:\xmls\N.xml', SINGLE_BLOB) AS x;

Now I want to query the data:

USE xmlFiles
GO

DECLARE @XML AS XML, @hDoc AS INT, @SQL NVARCHAR (MAX)

SELECT @XML = XmLData FROM tblXMLFiles

EXEC sp_xml_preparedocument @hDoc OUTPUT, @XML

SELECT Surname , GivenNames
FROM OPENXML(@hDoc, 'article/ref-list/ref/mixed-citation')
WITH 
(
    Surname [varchar](100) 'string-name/surname',
    GivenNames [varchar](100) 'string-name/given-names'
)

EXEC sp_xml_removedocument @hDoc
GO

The query is working, but the problem is that it returns the data only when there is only one row in a data source table — tblXMLFiles. If I add more than one row, I get empty result set.

Important:
The situation is changing if I add to the outer SELECT clause (SELECT @XML = XmLData…) the TOP statement, then it returns the queried data of the specific row number, according to the TOP value.

How can I retrieve the data not only when there is one line in the table, but many rows?


Solution

  • FROM OPENXML with the corresponding SPs to prepare and to remove a document is outdated and should not be used any more. Rather use the appropriate methods the XML data type provides.

    Without an example of your XML it is quite difficult to offer a solution, but my magic crystal ball tells me, that it might be something like this:

    SELECT f.IntCol 
          ,mc.value('(string-name/surname)[1]','nvarchar(max)') AS Surname
          ,mc.value('(string-name/given-names)[1]','nvarchar(max)') AS GivenNames
    FROM dbo.tblXMLFiles AS f
    OUTER APPLY f.XmlData.nodes('article/ref-list/ref/mixed-citation') AS A(mc)