XML Schema 1.1: asserting the number of permissible siblings with matching id values


I have the following (intended to be valid) XML:

<Series>
    <A>
        <id>1</id>
    </A>
    <B>
        <id>1</id>
    </B>
    <B>
        <id>1</id>
    </B>
    <B>
        <id>2</id>
    </B>
</Series>

Using XSD 1.1 and XPath 2.0, I want to assert the maximum number of elements named "B" that share the same "id" value as "A". Specifically, I want to limit the number of "B" elements with "id" = 1 to 2 occurrences. I don't care how many "B" elements have other "id" values that don't match A's "id" of 1 (there could be a million <B><id>2</id></B> elements, and the document should still validate).

Here's my attempted XML Schema 1.1 to enforce this, with an XPath 2.0 expression in the assert directive:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Series" type="series"/>
  <xs:complexType name="series">
    <xs:sequence>
        <xs:element name="A" type="a"/>
        <xs:element name="B" type="b" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="a">
    <xs:sequence>
      <xs:element name="id" type="xs:string"/>
    </xs:sequence>
    <xs:assert test="count(following-sibling::element(B)[id/text() = ./id/text()]) eq 2"/>
  </xs:complexType>
  <xs:complexType name="b">
    <xs:sequence>
        <xs:element name="id" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

But my assertion always fails with a cvc assertion error message. If I try to relax the assert to just <xs:assert test="count(following-sibling::element(B)) ge 1"/>, it still fails, so it seems like XML Schema 1.1 can't handle all XPath 2.0 constructs?

In general, is there a way to make assertions on a sibling in XML Schema 1.1? Thank you!


Solution

  • XML Schema 1.1 assertions can handle all of XPath 2.0, but according to the spec (section 3.13.4):

    1.3 From the "partial" ·post-schema-validation infoset·, a data model instance is constructed as described in [XDM]. The root node of the [XDM] instance is constructed from E; the data model instance contains only that node and nodes constructed from the [attributes], [children], and descendants of E.

    Note: It is a consequence of this construction that attempts to refer, in an assertion, to the siblings or ancestors of E, or to any part of the input document outside of E itself, will be unsuccessful. Such attempted references are not in themselves errors, but the data model instance used to evaluate them does not include any representation of any parts of the document outside of E, so they cannot be referred to.

    When an assertion is evaluated, only the subtree rooted at the element that hosts the assertion is available. So if you want to assert a relation between elements in different parts of the tree, you must put the assertion on one of their common ancestors. In your case that means moving the assertion from the "a" type to the "series" type, where both A and B are visible as children.

    I presume it was designed like this to allow a validating parser to evaluate assertions during parsing, at the end of the relevant element, rather than having to wait until the whole tree has been built and then evaluating all assertions en masse.
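
    As a minimal sketch of that fix, assuming the rule is "at most two B elements share A's id", the assertion can be hosted on the series type, which is the common ancestor of A and B. The test expression below is my own illustration rather than something taken from the question; swap "le 2" for "eq 2" if exactly two matches are required.

    ```xml
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="Series" type="series"/>
      <xs:complexType name="series">
        <xs:sequence>
          <xs:element name="A" type="a"/>
          <xs:element name="B" type="b" maxOccurs="unbounded"/>
        </xs:sequence>
        <!-- The context node is Series, so its children A and B are both
             inside the data model instance. Inside the predicate the context
             is a B element, and ".." steps back up to Series to reach A. -->
        <xs:assert test="count(B[id = ../A/id]) le 2"/>
      </xs:complexType>
      <xs:complexType name="a">
        <xs:sequence>
          <xs:element name="id" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
      <xs:complexType name="b">
        <xs:sequence>
          <xs:element name="id" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:schema>
    ```

    With this placement the sample document in the question should validate (two B elements have id 1), while adding a third <B><id>1</id></B> should trigger the assertion failure. Any number of B elements with other id values is unaffected.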