Get column names and types out of XMLSchema


I am working in C#, attempting to extract the column names and their types from an XML schema. Here is the schema as I receive it:

<xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
<xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:MainDataTable="WbDT" msdata:UseCurrentLocale="true">
<xs:complexType>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element name="WbDT">
        <xs:complexType>
        <xs:sequence>
            <xs:element name="TranslateNo" type="xs:int" minOccurs="0" />
            <xs:element name="EngWord" type="xs:string" minOccurs="0" />
            <xs:element name="LanguageCd" type="xs:string" minOccurs="0" />
            <xs:element name="TranslateWord" type="xs:string" minOccurs="0" />
            <xs:element name="UpdateDTS" type="xs:dateTime" minOccurs="0" />
            <xs:element name="UpdateBy" type="xs:string" minOccurs="0" />
        </xs:sequence>
        </xs:complexType>
    </xs:element>
    </xs:choice>
</xs:complexType>
</xs:element>
</xs:schema>

What I'd like to get out of that is a Dictionary where each entry is "ColumnName","ColumnType".

I am struggling with the syntax a little, though. Here is what I have so far:

var xml = result.WbDT;
XmlSchemaSet schemaSet = new XmlSchemaSet();
string xmlheader = "<?xml version='1.0' encoding='utf-8'?>" + System.Environment.NewLine;
string allxml = xmlheader + xml.Any[0].ToString();
schemaSet.Add("", XmlReader.Create(new StringReader(allxml)));
schemaSet.Compile();

XmlSchema customerSchema = null;
foreach (XmlSchema schema in schemaSet.Schemas())
{
    customerSchema = schema;
}

foreach (XmlSchemaElement element in customerSchema.Elements.Values)
{
    Console.WriteLine("Element: {0}", element.Name);

So far so good: I now have a schema with one element in it. Now I need to dig into <xs:complexType> -> <xs:element name="WbDT"> -> <xs:sequence> and iterate through that to get the column names and their types.

The syntax I am working with so far is:

XmlSchemaComplexType complexType = element.ElementSchemaType as XmlSchemaComplexType;
XmlSchemaSequence sequence = complexType.ContentTypeParticle as XmlSchemaSequence;

//// Iterate over each XmlSchemaElement in the Items collection.
foreach (XmlSchemaElement childElement in sequence.Items)
{
    Console.WriteLine("Element: {0}", childElement.Name);
}

But when I run this code, I get a NullReferenceException on sequence.Items.

I'm trying to follow along with this Microsoft documentation: https://learn.microsoft.com/en-us/dotnet/standard/data/xml/traversing-xml-schemas

I'm pretty sure the problem is that my XML has a deeper structure than the one in the example.

The example XML:

<xs:element name="Customer">  
    <xs:complexType>  
        <xs:sequence>  
            <xs:element name="FirstName" type="xs:string" />

My XML:

<xs:schema id="NewDataSet" xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
<xs:element name="NewDataSet" msdata:IsDataSet="true" msdata:MainDataTable="WbDT" msdata:UseCurrentLocale="true">
<xs:complexType>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element name="WbDT">
        <xs:complexType>
        <xs:sequence>
            <xs:element name="TranslateNo" type="xs:int" minOccurs="0" />

Solution

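  • Your problem is indeed that your XML has a deeper hierarchy than the example you are following. The content of the NewDataSet complex type is an xs:choice, not an xs:sequence, so the "as XmlSchemaSequence" cast returns null and accessing sequence.Items throws. You just need a few extra steps to walk down to the inner sequence: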

    // Requires "using System.Linq;" for Cast<T>() and First().
    foreach (XmlSchemaElement element in customerSchema.Elements.Values)
    {
        Console.WriteLine("Element: {0}", element.Name);

        // The top-level element (NewDataSet) wraps its content in an xs:choice.
        XmlSchemaComplexType complexType = element.ElementSchemaType as XmlSchemaComplexType;
        XmlSchemaChoice choice = complexType.ContentTypeParticle as XmlSchemaChoice;

        // The first item of the choice is the table element (WbDT);
        // its complex type holds the xs:sequence of columns.
        XmlSchemaElement outerElement = choice.Items.Cast<XmlSchemaElement>().First();
        XmlSchemaComplexType innerComplexType = outerElement.ElementSchemaType as XmlSchemaComplexType;
        XmlSchemaSequence xmlSchemaSequence = innerComplexType.ContentTypeParticle as XmlSchemaSequence;

        // Iterate over each XmlSchemaElement in the Items collection.
        foreach (XmlSchemaElement childElement in xmlSchemaSequence.Items)
        {
            Console.WriteLine("Element: {0}", childElement.Name);
        }
    }
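
The question also asks for a Dictionary of column names and types, which the loop above does not build. The following is a minimal sketch of one way to do that, not a definitive implementation. It assumes the schema always has the DataSet-style shape shown in the question (element -> choice -> table element -> sequence). The class name SchemaColumns, the method GetColumns, and the shortened schema in Main are hypothetical names introduced here for illustration. The type is taken from SchemaTypeName.Name, so the values are XSD type names such as "int" or "dateTime", not CLR type names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Schema;

public static class SchemaColumns
{
    // Hypothetical helper (not part of the original answer): maps each column
    // element name to the local name of its XSD type, e.g. "TranslateNo" -> "int".
    public static Dictionary<string, string> GetColumns(string xsd)
    {
        var schemaSet = new XmlSchemaSet();
        // Passing null uses the targetNamespace declared in the schema itself.
        schemaSet.Add(null, XmlReader.Create(new StringReader(xsd)));
        schemaSet.Compile();

        var columns = new Dictionary<string, string>();
        foreach (XmlSchema schema in schemaSet.Schemas())
        {
            foreach (XmlSchemaElement dataSet in schema.Elements.Values)
            {
                // Outer level: the xs:choice under the data set element.
                var outerType = dataSet.ElementSchemaType as XmlSchemaComplexType;
                var choice = outerType?.ContentTypeParticle as XmlSchemaChoice;
                if (choice == null) continue; // not the shape this sketch assumes

                foreach (var table in choice.Items.OfType<XmlSchemaElement>())
                {
                    // Inner level: the xs:sequence that lists the columns.
                    var tableType = table.ElementSchemaType as XmlSchemaComplexType;
                    var sequence = tableType?.ContentTypeParticle as XmlSchemaSequence;
                    if (sequence == null) continue;

                    foreach (var column in sequence.Items.OfType<XmlSchemaElement>())
                    {
                        columns[column.Name] = column.SchemaTypeName.Name;
                    }
                }
            }
        }
        return columns;
    }

    public static void Main()
    {
        // Shortened copy of the schema from the question (single-quoted attributes
        // so it fits in a verbatim string).
        const string xsd = @"<xs:schema id='NewDataSet' xmlns='' xmlns:xs='http://www.w3.org/2001/XMLSchema' xmlns:msdata='urn:schemas-microsoft-com:xml-msdata'>
  <xs:element name='NewDataSet' msdata:IsDataSet='true' msdata:MainDataTable='WbDT'>
    <xs:complexType>
      <xs:choice minOccurs='0' maxOccurs='unbounded'>
        <xs:element name='WbDT'>
          <xs:complexType>
            <xs:sequence>
              <xs:element name='TranslateNo' type='xs:int' minOccurs='0' />
              <xs:element name='EngWord' type='xs:string' minOccurs='0' />
              <xs:element name='UpdateDTS' type='xs:dateTime' minOccurs='0' />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>";

        foreach (var pair in GetColumns(xsd))
        {
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        }
    }
}
```

The null checks with "continue" replace the unchecked casts, so a schema that does not match the expected shape yields an empty dictionary instead of a NullReferenceException. If two tables in the same schema share a column name, the later one overwrites the earlier one in this sketch.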