Search code examples
sql-serverxmlxpathxquerysql-server-openxml

Parsing XML file with inline schema using OPENXML in SQL Server


I am trying to extract data from XML file and store in to SQL table using OPENXML in SQL Server. But, the query returns nothing.

XML data

<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsd:schema targetNamespace="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:schema="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sqltypes="http://schemas.microsoft.com/sqlserver/2004/sqltypes" elementFormDefault="qualified">
    <xsd:import namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes" schemaLocation="http://schemas.microsoft.com/sqlserver/2004/sqltypes/sqltypes.xsd"/>
    <xsd:element name="ogridroles">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="ogrid_cde" type="sqltypes:int" nillable="1"/>
                <xsd:element name="role" nillable="1">
                    <xsd:simpleType>
                        <xsd:restriction base="sqltypes:char" sqltypes:localeId="1033" sqltypes:sqlCompareOptions="IgnoreCase IgnoreKanaType IgnoreWidth" sqltypes:sqlSortId="52">
                            <xsd:maxLength value="1"/>
                        </xsd:restriction>
                    </xsd:simpleType>
                </xsd:element>
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>
</xsd:schema>
<ogridroles xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
    <ogrid_cde>28</ogrid_cde>
    <role>T</role>
</ogridroles>
<ogridroles xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
    <ogrid_cde>75</ogrid_cde>
    <role>T</role>
</ogridroles>
<ogridroles xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
    <ogrid_cde>93</ogrid_cde>
    <role>O</role>
</ogridroles>
<ogridroles xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
    <ogrid_cde>135</ogrid_cde>
    <role>O</role>
</ogridroles>
</root>

SQL query

DECLARE @xmlStr xml;
DECLARE @idoc INT

SELECT @xmlStr = BulkColumn FROM OPENROWSET(
   BULK 'D:\ogridroles.xml',
   SINGLE_BLOB) AS DATA;

EXEC sp_xml_preparedocument @idoc OUTPUT, @xmlStr;
SELECT *
FROM OPENXML (@idoc, '/root/ogridroles',2)
    WITH (
        ogrid_cde   int 'ogrid_cde',
        role varchar(1) 'role')
EXEC sp_xml_removedocument @idoc

I want to extract all row values of the elements

ogrid_cde
role


Solution

  • The approach via FROM OPENXML (together with the procedures to prepare and to remove a document) is out-dated and should not be used anymore.

    Much better are the native methods of the XML-type. Try this:

    Your XML

    DECLARE @xml XML=
    N'<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <xsd:schema xmlns:schema="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sqltypes="http://schemas.microsoft.com/sqlserver/2004/sqltypes" targetNamespace="urn:schemas-microsoft-com:sql:SqlRowSet1" elementFormDefault="qualified">
        <xsd:import namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes" schemaLocation="http://schemas.microsoft.com/sqlserver/2004/sqltypes/sqltypes.xsd" />
        <xsd:element name="ogridroles">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="ogrid_cde" type="sqltypes:int" nillable="1" />
              <xsd:element name="role" nillable="1">
                <xsd:simpleType>
                  <xsd:restriction base="sqltypes:char" sqltypes:localeId="1033" sqltypes:sqlCompareOptions="IgnoreCase IgnoreKanaType IgnoreWidth" sqltypes:sqlSortId="52">
                    <xsd:maxLength value="1" />
                  </xsd:restriction>
                </xsd:simpleType>
              </xsd:element>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:schema>
      <ogridroles xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
        <ogrid_cde>28</ogrid_cde>
        <role>T</role>
      </ogridroles>
      <ogridroles xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
        <ogrid_cde>75</ogrid_cde>
        <role>T</role>
      </ogridroles>
      <ogridroles xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
        <ogrid_cde>93</ogrid_cde>
        <role>O</role>
      </ogridroles>
      <ogridroles xmlns="urn:schemas-microsoft-com:sql:SqlRowSet1">
        <ogrid_cde>135</ogrid_cde>
        <role>O</role>
      </ogridroles>
    </root>';
    

    --The query

    WITH XMLNAMESPACES(DEFAULT 'urn:schemas-microsoft-com:sql:SqlRowSet1')
    SELECT OgrRol.value('(ogrid_cde/text())[1]','int') AS ogrid_cde
          ,OgrRol.value('(role/text())[1]','nvarchar(10)') AS [role]
    FROM @xml.nodes('/*:root/ogridroles') A(ogrRol);
    

    The result

    ogrid_cde   role
    28          T
    75          T
    93          O
    135         O
    

    Short explanation

    You can just ignore the XML-schema here. This would be useful in order to check the data integrity and validity. But assuming the data is correct, we can just read it.

    There is a (repeatedly defined) default namespace. But - this is important to see! - the element <root> itself is not living in this namespace!.

    Easy going, I've just used a namespace wildcard to address this element with *:root. In general, it's better to avoid wildcards, but in this case this seems acceptable.

    We use .nodes() to get a derived table of all the <ogridroles> elements. Then we us .value() to read the text() nodes of the code and the role element.

    Hope this helps, happy coding!