Is 'M4I' or 'M4M' a valid XML Schema Integer? If yes, why and what is its meaning?


I'm currently working with a large XML file, the OpenCyc ontology. (You can download it as opencyc-latest.owl.gz from here: http://sw.opencyc.org/)

This XML file contains lines like these:

<owl:ObjectProperty rdf:about="Mx4rvVi4w5wpEbGdrcN5Y29ycA">
    <rdfs:label xml:lang="en">Arg 3 Genl</rdfs:label>
    <cycAnnot:label xml:lang="en">arg3Genl</cycAnnot:label>
    <!-- [...] -->

    <!-- [Strange lines begin here] -->
    <Mx4rvViAzpwpEbGdrcN5Y29ycA 
      rdf:datatype="http://www.w3.org/2001/XMLSchema#integer"
      >M4I</Mx4rvViAzpwpEbGdrcN5Y29ycA>
    <Mx4rv6Bnr5wpEbGdrcN5Y29ycA 
      rdf:datatype="http://www.w3.org/2001/XMLSchema#integer"
      >M4M</Mx4rv6Bnr5wpEbGdrcN5Y29ycA>
    <!-- [Strange lines ended here] -->

    <!-- [...] -->
</owl:ObjectProperty>

Don't worry about the tag names. That's how OpenCyc actually names its tags. I'd rather draw attention to their content.

For anyone not familiar with RDF/XML documents: the rdf:datatype attribute on the two strange elements says that the content of the tag should be interpreted as an XML Schema integer.

My questions boil down to: Are M4I and M4M (or other strange values that I have found so far, like M4E and M4Q) actually valid XML Schema integers? Or are these errors in the OpenCyc ontology?

If they are actually valid, what is their meaning? And why are they valid at all? (I.e. which documentation should I read to get insights about their meaning?)


Solution

  • The literals you're referring to are not valid integers. The lexical representation of integer in the XML Schema type system is defined at http://www.w3.org/TR/xmlschema-2/#integer.

    It basically says:

    integer has a lexical representation consisting of a finite-length sequence of decimal digits (#x30-#x39) with an optional leading sign. If the sign is omitted, "+" is assumed. For example: -1, 0, 12678967543233, +100000.

    M4I and M4M contain letters, so they do not match this lexical representation. According to the described semantics, your file is invalid.
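
The rule quoted above can be checked mechanically. The following is a minimal sketch in Python, not part of any OpenCyc or RDF tooling: the name `is_xsd_integer` is made up for illustration, and the check is applied to the literal exactly as written, without any whitespace normalization.

```python
import re

# Lexical form quoted from XML Schema Part 2:
# an optional leading sign followed by one or more decimal digits (#x30-#x39).
# [0-9] is used instead of \d because \d also matches non-ASCII digits in Python.
XSD_INTEGER = re.compile(r"[+-]?[0-9]+")


def is_xsd_integer(literal: str) -> bool:
    """Return True if the whole string matches the xsd:integer lexical form.

    Hypothetical helper for illustration; the string is tested as-is.
    """
    return XSD_INTEGER.fullmatch(literal) is not None


if __name__ == "__main__":
    # Examples from the specification text, followed by the values from the question.
    for value in ["-1", "0", "12678967543233", "+100000", "M4I", "M4M"]:
        print(value, is_xsd_integer(value))
```

The four examples from the specification pass the check, while M4I and M4M fail it because they contain letters.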