xml xsd xsd-validation xml-validation standards-compliance

How to validate naming conventions in an XML Schema


We're working on naming conventions for XML Schemas (XSD) we're using inside our company.
To make sure everyone will honour the conventions, I'm looking for a way to validate the XSD.

The IDE we're using is IntelliJ, which should be able to use an XSD file to validate other XSD files.

One way would be to use an extended version of the W3C schema for XSD itself, but that has some implications. Another is to just write some Python or Java code to validate our specific conventions, but that feels kinda weird.

Some of the things we would like to check:

  1. A simple type name should always be upper camel case (camel case with a leading uppercase letter) and end with SType.
  2. A complex type name should also be upper camel case, but end with CType.
  3. If a simple type contains a restriction with enumerations, its name should end with QualifierSType.
  4. If a complex type contains only a list of a simple type, its name should always end with ListCType.

Is there some standard tool that can accomplish these validations, or do we have to develop something by ourselves? And what would be the best approach to accomplish this?


Solution

  • I've automated WIPO ST.96 XML Design Rules and Conventions conformance via Schematron. Many rules are simple to represent, but some naming conventions such as CamelCase would require some serious dictionary-driven code. Consider:

    • GD-10: Type names MUST use the UCC convention and have the suffix Type. For example, ApplicantType.

    We decided to forgo the lexical sophistication needed for full UCC verification, but we at least check that the name starts with a capital letter, is not all uppercase, and ends with the required suffix:

      <pattern>
        <title>GD-10</title>
        <rule context="xsd:complexType[@name] | xsd:simpleType[@name]">
          <assert test="fnx:is-exception('GD-10')
                        or matches(@name,'^[A-Z]')" flag="AUTO" role="ERROR">
            The <value-of select="local-name()"/> name <value-of
            select="@name"/> does not start with an upper-case letter.
          </assert>
          <assert test="fnx:is-exception('GD-10') or
                        not(matches(@name,'^[A-Z]+$'))"
                  flag="AUTO" role="ERROR">
            The <value-of select="local-name()"/> name <value-of
            select="@name"/> contains all upper-case letters instead of
            using camel case.
          </assert>
          <assert test="fnx:is-exception('GD-10') or
                        ends-with(@name,'Type')" flag="AUTO" role="ERROR">
            The <value-of select="local-name()"/> name <value-of
            select="@name"/> does not end with Type.
          </assert>
        </rule>
      </pattern>
    

    (fnx:is-exception() is just a little utility function that allows an organization to opt out of any given convention via an entry in a configuration file.)
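
    The same approach could plausibly be adapted to the four conventions in the question. What follows is only a rough sketch, not a tested rule set: it drops the fnx:is-exception() opt-out, the regular expressions are a loose stand-in for real camel-case checking, and the reading of "only contains a list of a simple type" (a sequence whose single child is a repeating element typed with an ...SType) is an assumption that may not match your schemas. The queryBinding is set to xslt2 because matches() and ends-with() are XPath 2.0 functions.

      <schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
        <ns prefix="xsd" uri="http://www.w3.org/2001/XMLSchema"/>

        <pattern>
          <title>Simple type names (conventions 1 and 3)</title>
          <rule context="xsd:simpleType[@name]">
            <!-- Leading upper-case letter, alphanumeric body, SType suffix -->
            <assert test="matches(@name, '^[A-Z][A-Za-z0-9]*SType$')">
              The simpleType name <value-of select="@name"/> must start
              with an upper-case letter and end with SType.
            </assert>
            <!-- Only applies when the restriction carries enumerations -->
            <assert test="not(xsd:restriction/xsd:enumeration)
                          or ends-with(@name, 'QualifierSType')">
              The enumerated simpleType name <value-of select="@name"/>
              must end with QualifierSType.
            </assert>
          </rule>
        </pattern>

        <pattern>
          <title>Complex type names (conventions 2 and 4)</title>
          <rule context="xsd:complexType[@name]">
            <assert test="matches(@name, '^[A-Z][A-Za-z0-9]*CType$')">
              The complexType name <value-of select="@name"/> must start
              with an upper-case letter and end with CType.
            </assert>
            <!-- Assumed definition of a "list" type: the sequence has exactly
                 one child, a repeating element whose type ends with SType -->
            <assert test="not(count(xsd:sequence/*) = 1
                              and xsd:sequence/xsd:element
                                    [@maxOccurs = 'unbounded'
                                     or number(@maxOccurs) gt 1]
                                    [ends-with(@type, 'SType')])
                          or ends-with(@name, 'ListCType')">
              The list-like complexType name <value-of select="@name"/>
              must end with ListCType.
            </assert>
          </rule>
        </pattern>
      </schema>

    Because an XSD is itself an XML document, a schema like this can be run against your .xsd files by any Schematron processor that supports the XSLT 2 query binding.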