I have a few XSD files that define a report, wondering if there's a way load this into a table that I can query.
The elements XSD defines the required fields, it looks like:
<xs:simpleType name="NHSNumberStatusIndicatorCode_Withheld_Type">
<xs:restriction base="ns:AlphaNumeric_Type">
<xs:length value="2"/>
<xs:enumeration value="01"/>
<xs:enumeration value="02"/>
<xs:enumeration value="03"/>
<xs:enumeration value="04"/>
<xs:enumeration value="05"/>
<xs:enumeration value="06"/>
<xs:enumeration value="07"/>
<xs:enumeration value="08"/>
</xs:restriction>
</xs:simpleType>
I need to convert it to
Field_Name Type Length enumeration MaxLen MinLen
NHSNumber... AlphaNumeric_Type 2 01
NHSNumber... AlphaNumeric_Type 2 02
NHSNumber... AlphaNumeric_Type 2 03
and so on
Some fields don't have enumeration, they might look like:
<xs:simpleType name="CDSOrganisationIdentifier_Type">
<xs:restriction base="ns:AlphaNumeric_Type">
<xs:minLength value="3"/>
<xs:maxLength value="5"/>
</xs:restriction>
</xs:simpleType>
OR
<xs:simpleType name="Number_Type">
<xs:restriction base="xs:string">
<xs:pattern value="[0-9]{10}"/>
</xs:restriction>
</xs:simpleType>
You can load it into an XML variable and then use various SQL Server XML methods to select the nodes of interest and extract the values. See xml Data Type Methods and Path Expressions (XQuery) for more information.
In the following, I wrapped your XSD fragment in a "schema" element that also defined the "xs:" namespace prefix.
DECLARE @xml XML = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:simpleType name="NHSNumberStatusIndicatorCode_Withheld_Type">
<xs:restriction base="ns:AlphaNumeric_Type">
<xs:length value="2"/>
<xs:enumeration value="01"/>
<xs:enumeration value="02"/>
<xs:enumeration value="03"/>
<xs:enumeration value="04"/>
<xs:enumeration value="05"/>
<xs:enumeration value="06"/>
<xs:enumeration value="07"/>
<xs:enumeration value="08"/>
</xs:restriction>
</xs:simpleType>
</xs:schema >'
SELECT
N.node.value('(../../@name)[1]', 'nvarchar(max)') AS Field_Name,
N.node.value('(../@base)[1]', 'nvarchar(max)') AS Type,
N.node.value('(../xs:length/@value)[1]', 'int') AS Length,
N.node.value('@value[1]', 'nvarchar(max)') AS enumeration
FROM @xml.nodes('//xs:enumeration') N(node)
The .nodes()
method selects all of the enumeration nodes from the supplied xml. The leading //
selects nodes at any depth. The N(nodes)
provide arbitrary table and column aliases to the selected results. The .value()
method is then used to select attributes values from either the selected node, parent node, or sibling node in the case of Length. The @
prefix on a name refers to an attribute instead of a child node. Because many XPath selectors could potentially select more than one value, you often need to use [1]
or (...)[1]
to select just the first.
Result:
Field_Name | Type | Length | enumeration |
---|---|---|---|
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 01 |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 02 |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 03 |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 04 |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 05 |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 06 |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 07 |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 08 |
See this db<>fiddle.
The XSD can also be loaded into an XSD typed column of a table and accessed in a manner similar to the above by replacing @xml
with the column reference.
The above is coded for this specific enumeration type scenario. If your actual XSD contains other type definitions that don't follow this pattern, you may need to handle those cases separately.
UPDATE: To accommodate the following additional cases from your amended post ...
<xs:simpleType name="CDSOrganisationIdentifier_Type">
<xs:restriction base="ns:AlphaNumeric_Type">
<xs:minLength value="3"/>
<xs:maxLength value="5"/>
</xs:restriction>
</xs:simpleType>
<xs:simpleType name="Number_Type">
<xs:restriction base="xs:string">
<xs:pattern value="[0-9]{10}"/>
</xs:restriction>
</xs:simpleType>
a different approach is needed. After an initial .nodes()
call to get the <xs:simpleType ...>
nodes, additional calls are made to access selected subordinate nodes, some of which may or may not exist. Finally, the .value()
calls in the select list have been updated to pull data relative to one of the several selected nodes.
SELECT
S.Node.value('(@name)[1]', 'nvarchar(max)') AS Field_Name,
R.Node.value('(@base)[1]', 'nvarchar(max)') AS Type,
R.Node.value('(xs:length/@value)[1]', 'int') AS Length,
E.Node.value('@value[1]', 'nvarchar(max)') AS enumeration,
R.Node.value('(xs:maxLength/@value)[1]', 'int') AS MaxLength,
R.Node.value('(xs:minLength/@value)[1]', 'int') AS MinLength,
R.Node.value('(xs:pattern/@value)[1]', 'nvarchar(max)') AS Pattern
FROM @xml.nodes('//xs:simpleType') S(Node)
OUTER APPLY S.Node.nodes('xs:restriction') R(Node) -- Likely always one occurrence
OUTER APPLY R.Node.nodes('xs:enumeration') E(Node) -- Zero or more occurrences
An OUTER APPLY
is like a LEFT JOIN
to a subselect, except that no ON
condition used.
The result is:
Field_Name | Type | Length | enumeration | MaxLength | MinLength | Pattern |
---|---|---|---|---|---|---|
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 01 | null | null | null |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 02 | null | null | null |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 03 | null | null | null |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 04 | null | null | null |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 05 | null | null | null |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 06 | null | null | null |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 07 | null | null | null |
NHSNumberStatusIndicatorCode_Withheld_Type | ns:AlphaNumeric_Type | 2 | 08 | null | null | null |
CDSOrganisationIdentifier_Type | ns:AlphaNumeric_Type | null | null | 5 | 3 | null |
Number_Type | xs:string | null | null | null | null | [0-9]{10} |