.net xml xpath xsd

Can I validate an XPath expression against an XML schema?


You can test an XPath expression against an XML document to verify it, but is there an easy way to verify the same XPath expression against the schema for that document?

Say I have an XSD schema like this:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" ... etc>
  <xsd:element name="RootData">
    <xsd:complexType>
      <xsd:sequence minOccurs="0">
        <xsd:element name="FirstChild">
          <xsd:complexType>
            <xsd:sequence minOccurs="0">
              <xsd:element name="FirstGrandChild">
... etc etc

Is there an easy or built-in way to verify that the XPath:

/RootData/FirstChild/FirstGrandChild

would be valid against any XML documents that may be based on that schema? (Edit: I guess I mean potentially valid; the actual XML document might not contain those elements, but that XPath could still be considered potentially valid for the schema. Whereas, say, /RootData/ClearlyInvalidChild/ThisElementDoesntExistEither is clearly invalid.)

Of course I could only expect this to work against canonical XPath expressions rather than ones of arbitrary complexity, but that's fine.

I'm specifically thinking of .NET, but I'm curious whether other implementations make it possible. It's not so important that I want to roll my own; for example, I don't really want to write my own code to transform that XPath expression into another one like:

/xsd:schema/xsd:element[@name='RootData']/xsd:complexType/xsd:sequence/xsd:element[@name='FirstChild']/...etc...

... though I know I could do that if I really had to.

Cheers!


Solution

  • We actually did a research project on this and implemented an XPath verifier sometime around 2000. This was for XPath 1.0. I am not aware of any currently available libraries that you can use to do this.

    If you want to go and implement this yourself, here are some hints:

    • You will not be able to transform a path over an instance document into a path over a schema as you do above. For example, /a//b does not transform into /xsd:element[@name='a']//xsd:element[@name='b'], because element b may be defined at the top level of the schema, not underneath a.

    • Remember that while an XML document is a tree, a schema is a graph. If you search descendant paths like //a, you will have to decide when to terminate the search, or it may continue forever (e.g. imagine an element "a" that contains "b", which in turn contains "a").

    • Some paths will be undecidable, or at least very hard to decide. For example: //*[starts-with(@name, 'foo')]
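
To illustrate the termination point above, here is a minimal sketch in Python (not .NET). It assumes the schema has already been reduced to a plain mapping from each element name to the element names it may contain. The mapping `schema`, the function `has_descendant`, and the element names are all hypothetical and exist only for this example. Keeping a set of already visited declarations is one simple way to make a descendant search stop on a recursive schema.

```python
from collections import deque


def has_descendant(children, start, target):
    """Return True if `target` may occur anywhere below `start`.

    `children` maps an element name to the names it may contain.
    The `seen` set guarantees termination even when the schema is recursive.
    """
    seen = set()
    queue = deque(children.get(start, ()))
    while queue:
        name = queue.popleft()
        if name == target:
            return True
        if name in seen:
            continue  # already expanded this declaration, do not loop
        seen.add(name)
        queue.extend(children.get(name, ()))
    return False


# Hypothetical recursive schema: "a" contains "b", which contains "a" and "c".
schema = {"a": ["b"], "b": ["a", "c"], "c": []}

print(has_descendant(schema, "a", "a"))  # True  (a -> b -> a)
print(has_descendant(schema, "c", "a"))  # False (c has no children)
print(has_descendant(schema, "a", "z"))  # False, and the search still ends
```

Without the `seen` set, the last call would keep expanding a, b, a, b and never return.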

    If you're still up for it, I suggest using a library like Eclipse's XSD or the .NET schema loading classes to load the schema into memory and do your checking in code.
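
As a rough sketch of "load the schema and check in code", the Python example below (standard library only, not .NET or Eclipse XSD) walks an XSD and tests whether a simple child-axis path is potentially valid. It works under heavy assumptions: `SAMPLE_SCHEMA` is a hypothetical completion of the elided schema from the question, the helpers `child_declarations` and `is_potentially_valid` are made up for this example, and only inline anonymous complex types are followed. Named types, `ref=`, groups, extensions, substitution groups, target namespaces, `//`, wildcards, and predicates are not handled.

```python
import re
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical completion of the question's schema, for illustration only.
SAMPLE_SCHEMA = """\
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="RootData">
    <xsd:complexType>
      <xsd:sequence minOccurs="0">
        <xsd:element name="FirstChild">
          <xsd:complexType>
            <xsd:sequence minOccurs="0">
              <xsd:element name="FirstGrandChild" type="xsd:string"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

# Containers we look through to find nested element declarations.
CONTAINERS = {XSD + t for t in ("complexType", "sequence", "choice", "all")}


def child_declarations(node):
    """Yield the xsd:element declarations that are direct children of
    `node` in the instance document (inline content models only)."""
    for child in node:
        if child.tag == XSD + "element":
            yield child
        elif child.tag in CONTAINERS:
            yield from child_declarations(child)


def is_potentially_valid(schema_root, xpath):
    """Check a path of the form /a/b/c against the schema."""
    if not re.fullmatch(r"(?:/[A-Za-z_][\w.-]*)+", xpath):
        raise ValueError("only simple /a/b/c paths are supported")
    candidates = [schema_root]
    for step in xpath.strip("/").split("/"):
        candidates = [
            decl
            for node in candidates
            for decl in child_declarations(node)
            if decl.get("name") == step
        ]
        if not candidates:
            return False
    return True


root = ET.fromstring(SAMPLE_SCHEMA)
print(is_potentially_valid(root, "/RootData/FirstChild/FirstGrandChild"))  # True
print(is_potentially_valid(
    root, "/RootData/ClearlyInvalidChild/ThisElementDoesntExistEither"))  # False
```

The sketch resolves each step against the element declarations rather than rewriting the XPath into a path over the XSD, which is why top-level declarations and nested ones are treated the same way. A real checker would also have to resolve type and element references, where the cycle handling described in the hints becomes necessary.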