
NHibernate mapping from a DTD, an XSD, or even the XML document itself


I have searched and searched but couldn't find anything even remotely resembling an answer.

I have a DTD for an XML document type (my own document type, nothing NHibernate-related) and I would like to store all documents of that type in a relational model. No, I don't want to store the XML file itself in the database as a varchar or XML column or whatnot; that defeats the purpose. Instead I want to break each document down into its elements and attributes and store THAT as a proper relational model. And XML supports that.

In Visual Studio 2015 I can create an XSD from the DTD and then use that XSD to generate C# classes that reflect it (and, by extension, the original DTD). Documents parse and everything is OK.

Now the question: how do I store those XML documents in an RDBMS through NHibernate, without (lots of) hand-coding, so that all the relations show up as actual relations? There has to be an easy way using the auto-mapping features, but the original DTD has some "limitations" (IDREF attributes and similar relational constructs) that I would like to have "converted" into relations on the fly, so that I get proper references to other classes rather than a "code" stored as a string value.

So basically, I need someone who is an NHibernate && XML && DTD && XSD guru at the same time to shed some light on how this can easily be achieved. I was 100% certain that this kind of thing had been "normal" with Hibernate and NHibernate for the past 10-15 years at least (I never tried it; this is the first time I need to store XML documents in a database broken down into their constituent parts rather than as a whole).

If such a thing is not possible, is there even such a thing as an "XML document driver" for NHibernate, so that the data doesn't have to go into an RDBMS but remains an XML document on the file system?

Example (SGML/XML gurus, look at IDREF and NMTOKEN: why are they "just" strings and not proper relations to whatever they point at, i.e. another Class, usage, variant or whatever?).

Expected result: the Reference is the "Class" object with "code" (or id) G117 itself (as in public virtual Class Reference { get; set; }).

Actual result: just a string "code" with the value "G117", as in:

    [System.Xml.Serialization.XmlAttributeAttribute(DataType="NMTOKEN")]
    public string code {
        get {
            return this.codeField;
        }
        set {
            this.codeField = value;
        }
    }

XML file:

<Class code="G117" kind="process">
    <SuperClass code="G"/>
    <Rubric id="13-223" kind="preferred">
        <Label xml:lang="en">A preferred label</Label>
    </Rubric>
    <Rubric id="13-224" kind="shortTitle">
        <Label xml:lang="en">Short title</Label>
    </Rubric>
    <Rubric id="13-225" kind="exclusion">
        <Label xml:lang="en">There is some exclusion text with a reference here <Reference>G12</Reference></Label>
    </Rubric>
    <Rubric id="13-226" kind="criteria">
        <Label xml:lang="en">Some criteria text goes here</Label>
    </Rubric>
</Class>

<Rubric id="56-327" kind="exclusion">
    <Label xml:lang="en">This is some text that might refer someplace <Reference>G117</Reference>; and another piece of text that refers to another place <Reference>BF9</Reference>; Another text describing something and there might be a reference from this piece of text somewhere else too <Reference>AB7</Reference></Label>
</Rubric>

DTD:

<!ELEMENT Class (Meta*,SuperClass*,SubClass*,ModifiedBy*,ExcludeModifier*,Rubric*,History*)>
<!ATTLIST Class code NMTOKEN #REQUIRED kind IDREF #REQUIRED usage IDREF #IMPLIED variants IDREFS #IMPLIED>
<!ELEMENT Rubric (Label+,History*)>
<!ATTLIST Rubric id ID #IMPLIED kind IDREF #REQUIRED usage IDREF #IMPLIED>
<!ELEMENT Reference (#PCDATA)>
<!ATTLIST Reference classCode CDATA #IMPLIED authority NMTOKEN #IMPLIED uid NMTOKEN #IMPLIED code NMTOKEN #IMPLIED usage IDREF #IMPLIED variants IDREFS #IMPLIED>

Resulting XSD:

  <xs:element name="Class">
    <xs:complexType>
      <xs:sequence>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="Meta" />
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="SuperClass" />
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="SubClass" />
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="ModifiedBy" />
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="ExcludeModifier" />
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="Rubric" />
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="History" />
      </xs:sequence>
      <xs:attribute name="code" type="xs:NMTOKEN" use="required" />
      <xs:attribute name="kind" type="xs:IDREF" use="required" />
      <xs:attribute name="usage" type="xs:IDREF" />
      <xs:attribute name="variants" type="xs:IDREFS" />
    </xs:complexType>
  </xs:element>
  <xs:element name="Rubric">
    <xs:complexType>
      <xs:sequence>
        <xs:element minOccurs="1" maxOccurs="unbounded" ref="Label" />
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="History" />
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID" />
      <xs:attribute name="kind" type="xs:IDREF" use="required" />
      <xs:attribute name="usage" type="xs:IDREF" />
    </xs:complexType>
  </xs:element>
  <xs:element name="Reference">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="classCode" type="xs:string" />
          <xs:attribute name="authority" type="xs:NMTOKEN" />
          <xs:attribute name="uid" type="xs:NMTOKEN" />
          <xs:attribute name="code" type="xs:NMTOKEN" />
          <xs:attribute name="usage" type="xs:IDREF" />
          <xs:attribute name="variants" type="xs:IDREFS" />
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

Generated C# classes:

/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "4.6.1055.0")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType=true, Namespace="http://tempuri.org/MyStuff")]
[System.Xml.Serialization.XmlRootAttribute(Namespace="http://tempuri.org/MyStuff", IsNullable=false)]
public partial class Class {

    private Meta[] metaField;

    private SuperClass[] superClassField;

    private SubClass[] subClassField;

    private ModifiedBy[] modifiedByField;

    private ExcludeModifier[] excludeModifierField;

    private Rubric[] rubricField;

    private History[] historyField;

    private string codeField;

    private string kindField;

    private string usageField;

    private string variantsField;

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("Meta")]
    public Meta[] Meta {
        get {
            return this.metaField;
        }
        set {
            this.metaField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("SuperClass")]
    public SuperClass[] SuperClass {
        get {
            return this.superClassField;
        }
        set {
            this.superClassField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("SubClass")]
    public SubClass[] SubClass {
        get {
            return this.subClassField;
        }
        set {
            this.subClassField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("ModifiedBy")]
    public ModifiedBy[] ModifiedBy {
        get {
            return this.modifiedByField;
        }
        set {
            this.modifiedByField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("ExcludeModifier")]
    public ExcludeModifier[] ExcludeModifier {
        get {
            return this.excludeModifierField;
        }
        set {
            this.excludeModifierField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("Rubric")]
    public Rubric[] Rubric {
        get {
            return this.rubricField;
        }
        set {
            this.rubricField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("History")]
    public History[] History {
        get {
            return this.historyField;
        }
        set {
            this.historyField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="NMTOKEN")]
    public string code {
        get {
            return this.codeField;
        }
        set {
            this.codeField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="IDREF")]
    public string kind {
        get {
            return this.kindField;
        }
        set {
            this.kindField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="IDREF")]
    public string usage {
        get {
            return this.usageField;
        }
        set {
            this.usageField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="IDREFS")]
    public string variants {
        get {
            return this.variantsField;
        }
        set {
            this.variantsField = value;
        }
    }
}

/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "4.6.1055.0")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType=true, Namespace="http://tempuri.org/MyStuff")]
[System.Xml.Serialization.XmlRootAttribute(Namespace="http://tempuri.org/MyStuff", IsNullable=false)]
public partial class Rubric {

    private Label[] labelField;

    private History[] historyField;

    private string idField;

    private string kindField;

    private string usageField;

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("Label")]
    public Label[] Label {
        get {
            return this.labelField;
        }
        set {
            this.labelField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("History")]
    public History[] History {
        get {
            return this.historyField;
        }
        set {
            this.historyField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="ID")]
    public string id {
        get {
            return this.idField;
        }
        set {
            this.idField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="IDREF")]
    public string kind {
        get {
            return this.kindField;
        }
        set {
            this.kindField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="IDREF")]
    public string usage {
        get {
            return this.usageField;
        }
        set {
            this.usageField = value;
        }
    }
}

/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "4.6.1055.0")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(AnonymousType=true, Namespace="http://tempuri.org/MyStuff")]
[System.Xml.Serialization.XmlRootAttribute(Namespace="http://tempuri.org/MyStuff", IsNullable=false)]
public partial class Reference {

    private string classCodeField;

    private string authorityField;

    private string uidField;

    private string codeField;

    private string usageField;

    private string variantsField;

    private string valueField;

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute()]
    public string classCode {
        get {
            return this.classCodeField;
        }
        set {
            this.classCodeField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="NMTOKEN")]
    public string authority {
        get {
            return this.authorityField;
        }
        set {
            this.authorityField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="NMTOKEN")]
    public string uid {
        get {
            return this.uidField;
        }
        set {
            this.uidField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="NMTOKEN")]
    public string code {
        get {
            return this.codeField;
        }
        set {
            this.codeField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="IDREF")]
    public string usage {
        get {
            return this.usageField;
        }
        set {
            this.usageField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="IDREFS")]
    public string variants {
        get {
            return this.variantsField;
        }
        set {
            this.variantsField = value;
        }
    }

    /// <remarks/>
    [System.Xml.Serialization.XmlTextAttribute()]
    public string Value {
        get {
            return this.valueField;
        }
        set {
            this.valueField = value;
        }
    }
}

Solution

Well, Oskar, and anyone else who might be interested in something like this: it took me 12 days, but in the end I figured it out and it works. It takes a few seconds (maybe 2) to create the mappings and the database and to fill it with the data from the XML, but after that it works beautifully. I'm still not sure how to map the "IDREF" types to real IDs, but that's a minor issue compared to what I went through over the last few days.
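
For the open IDREF/NMTOKEN point, one possible approach (a sketch under assumptions, not what was actually done here) is to resolve the string codes into object references in memory after deserialization and before anything is handed to NHibernate. The types ClassEntity and ReferenceEntity and the helper ReferenceResolver below are hypothetical, simplified stand-ins for the xsd.exe-generated classes; in the real documents the Reference elements sit inside Label elements inside Rubric elements, which is flattened here to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the generated "Class" type.
public class ClassEntity
{
    public string Code { get; set; }   // the NMTOKEN "code" attribute, e.g. "G117"
    public List<ReferenceEntity> References { get; } = new List<ReferenceEntity>();
}

// Hypothetical stand-in for the generated "Reference" type.
public class ReferenceEntity
{
    public string Value { get; set; }        // raw element text, e.g. "G12"
    public ClassEntity Target { get; set; }  // resolved object; stays null if unknown
}

public static class ReferenceResolver
{
    // Indexes all classes by code, then points each reference at the class
    // whose code matches its text. Returns the raw values that had no match.
    public static List<string> Resolve(IEnumerable<ClassEntity> classes)
    {
        var all = classes.ToList();
        var byCode = all.ToDictionary(c => c.Code, StringComparer.Ordinal);
        var unresolved = new List<string>();

        foreach (var reference in all.SelectMany(c => c.References))
        {
            ClassEntity target;
            if (reference.Value != null &&
                byCode.TryGetValue(reference.Value.Trim(), out target))
            {
                reference.Target = target;
            }
            else
            {
                unresolved.Add(reference.Value);
            }
        }

        return unresolved;
    }
}
```

With references resolved this way, a property such as Target could then be mapped as an ordinary many-to-one association instead of a string column. The same lookup idea would apply to the IDREF attributes (kind, usage), and an IDREFS value such as variants would first need to be split on whitespace.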