I am using Saxon9EE.jar to validate an XML document.
I have an assertion in the simple type of my element that checks that the year of the incoming date is after 1900, and it works perfectly. But Saxon also reports errors for all the record-level asserts that reference that element by name.
My XSD:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:saxon="http://saxon.sf.net/">
<xs:element name="Root">
<xs:annotation>
<xs:appinfo>
<XSDVersion>1</XSDVersion>
<fieldSeparator>|</fieldSeparator>
<recordSeparator>\n</recordSeparator>
<allowDiscontinousOrder>true</allowDiscontinousOrder>
<allowIgnoreCase>false</allowIgnoreCase>
<allowLessFields>true</allowLessFields>
<removeInvalidChar>false</removeInvalidChar>
<enclosedChar/>
</xs:appinfo>
</xs:annotation>
<xs:complexType>
<xs:sequence>
<xs:element name="Record" minOccurs="1" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="LoanOpenDate" nillable="false" minOccurs="0" maxOccurs="1">
<xs:annotation>
<xs:appinfo>
<format>AAAAAAA</format>
<originalName><![CDATA[LoanOpenDate]]></originalName>
<parent> </parent>
</xs:appinfo>
</xs:annotation>
<xs:simpleType>
<xs:restriction base="CMGDateFormat">
<xs:assertion test="if(string-length($value) != 0) then true() else false()" saxon:message="LoanOpenDate, should have a valid input"/>
<xs:assertion test="if(string-length($value) != 0 and string-length($value) = 10 ) then (xs:integer(substring($value,7,4)) > 1900) else true()" saxon:message="LoanOpenDate, should have a valid input, Year should be after 1900"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
<xs:element name="LoanClosedDate" minOccurs="0" maxOccurs="1">
<xs:annotation>
<xs:appinfo>
<format>AAAAAAA</format>
<originalName><![CDATA[LoanClosedDate]]></originalName>
<parent> </parent>
</xs:appinfo>
</xs:annotation>
<xs:simpleType>
<xs:restriction base="CMGDateFormat">
<xs:assertion test="if(string-length($value) != 0 and string-length($value) = 10 ) then (xs:integer(substring($value,7,4)) > 1900) else true()" saxon:message="LoanOpenDate, should have a valid input, Year should be after 1900"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
</xs:sequence>
<xs:attribute name="recordNumber" type="xs:string" use="required"/>
<xs:assert test="if(xs:integer(substring(LoanClosedDate,7,4)) > 1900 and xs:integer(substring(LoanOpenDate,7,4)) > 1900 and string-length(LoanClosedDate) != 0) then string-length(LoanOpenDate) != 0 else true()" saxon:message="LoanOpenDate, cannot be null if a LoanClosedDate exists"/>
<xs:assert test="if(string-length(LoanOpenDate) != 0 and string-length(LoanClosedDate) != 0 and xs:integer(substring(LoanClosedDate,7,4)) > 1900 and xs:integer(substring(LoanOpenDate,7,4)) > 1900 and xs:long(concat(substring(LoanClosedDate,7,4),substring(LoanClosedDate,1,2),substring(LoanClosedDate,4,2))) != xs:long(concat(substring(LoanOpenDate,7,4),substring(LoanOpenDate,1,2), substring(LoanOpenDate,4,2))))
then
(xs:date(concat(substring( LoanOpenDate,7 ,4 ) ,'-',substring(LoanOpenDate,1,2 ),'-', substring(LoanOpenDate,4,2))) &lt; (xs:date(concat(substring(LoanClosedDate,7,4),'-',substring(LoanClosedDate,1,2),'-',substring(LoanClosedDate,4,2)))))
else true()" saxon:message="LoanOpenDate, cannot be a date after LoanClosedDate, cannot be null if a LoanClosedDate exists, cannot be equal to LoanClosedDate"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:simpleType name="CMGDateFormat">
<xs:annotation>
<xs:documentation>This type is used for dates requested in mm/dd/yyyy format.</xs:documentation>
</xs:annotation>
<xs:restriction base="xs:string">
<xs:pattern value="((((0[1-9]|1[012])[/](0[1-9]|1[0-9]|2[0-8]))|((0[13578]|1[02])[/](29|30|31))|((0[469]|11)[/](29|30)))[/](19|[2-9][0-9])\d\d)|(02[/]29[/](19|[2-9][0-9])(00|04|08|12|16|20|24|28|32|36|40|44|48|52|56|60|64|68|72|76|80|84|88|92|96))|\s*"/>
</xs:restriction>
</xs:simpleType>
</xs:schema>
This is the XML I am validating:
<?xml version="1.0" encoding="ISO-8859-1"?>
<Root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Record recordNumber = "1" >
<LoanOpenDate><![CDATA[08/06/2008]]></LoanOpenDate>
<LoanClosedDate><![CDATA[10/10/1900]]></LoanClosedDate>
</Record>
</Root>
I expect Saxon not to report errors for the asserts at the Record level; only one error should be generated, at the element level. But that is not what happens. This is the validation report I get:
<?xml version="1.0" encoding="UTF-8"?>
<validation-report xmlns="http://saxon.sf.net/ns/validation"
system-id="file:/K:/redir/My%20Documents/MyJabberFiles/[email protected]/SaxonStandalone/Loantest.xml">
<error line="7"
column="18"
path="/Q{}Root[1]/Q{}Record[1]/Q{}LoanClosedDate[1]"
xsd-part="2"
constraint="cvc-datatype-valid.1">The content "10/10/1900" of element &lt;LoanClosedDate&gt; does not match the required simple type. Value "10/10/1900" contravenes the assertion facet "if(string-length($value) != 0 ..." of the type of element LoanClosedDate. LoanOpenDate, should have a valid input, Year should be after 1900</error>
<error line="3"
column="30"
path="/Q{}Root[1]/Q{}Record[1]"
xsd-part="1"
constraint="sec-cvc-assertion.0">Element Record does not satisfy assertion. LoanOpenDate, cannot be a date after LoanClosedDate, cannot be null if a LoanClosedDate exists, cannot be equal to LoanClosedDate</error>
<error line="3"
column="30"
path="/Q{}Root[1]/Q{}Record[1]"
xsd-part="1"
constraint="sec-cvc-assertion.0">Element Record does not satisfy assertion. LoanOpenDate, cannot be null if a LoanClosedDate exists</error>
<meta-data>
<validator name="SAXON-EE" version="9.8.0.4"/>
<results errors="3" warnings="0"/>
<schema file="Loan1.xsd" xsd-version="1.1"/>
<run at="2018-01-30T10:50:42.45-06:00"/>
</meta-data>
</validation-report>
Can you let me know whether there is a workaround, or whether this could be a bug in Saxon?
Just as when compiling a language like Java, it's a tough decision how many errors to report. You don't want to stop reporting after the first one, because people would like to correct all the errors before trying validation again, but you don't want to report the same error in several different ways just because more than one rule in the language spec (or in this case, the schema) is violated.
It would certainly be possible to say "don't evaluate assertions on a parent element if any of the child elements has been found to be invalid". But then, if an assertion was false for reasons unrelated to the error in the child element, there would be two errors in your instance document and only one of them would be reported.
The ideal solution might be to say "don't evaluate assertions on a parent element if it needs access to children that have been found to be invalid". But that's really hard to implement.
In this situation you've got one piece of invalid data in your instance document that causes three rules in your schema to fail, and Saxon is reporting all three. I'm afraid that's simply the way that it works.
Some people like to organize validation in multiple phases so you check the structural rules first, and then the "business" rules. That involves a pipeline of multiple schemas. It's possible, but it's a lot of work.
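As a rough illustration of the phased approach, the control flow might look like the sketch below. It is written in Python only for brevity, and the validators are hypothetical callables: in practice each one would wrap a real validation run against one schema (for example a structural schema without the record-level asserts, then a second schema with them). Nothing here is Saxon API.

```python
from typing import Callable, Iterable, List, Tuple

# A phase is a (name, validator) pair. Each validator is a hypothetical
# callable that takes the document path and returns a list of error messages;
# a real one would wrap a call to a schema validator.
Phase = Tuple[str, Callable[[str], List[str]]]


def validate_in_phases(doc_path: str, phases: Iterable[Phase]) -> Tuple[str, List[str]]:
    """Run the phases in order and stop at the first one that reports errors.

    Returns (phase_name, errors) for the first failing phase, or ("", [])
    if every phase passes.
    """
    for name, validator in phases:
        errors = validator(doc_path)
        if errors:
            # Later phases are skipped, so "business" rules never run
            # against data that already failed the structural checks.
            return name, errors
    return "", []
```

The trade-off is the one described above: errors in later phases stay hidden until the earlier phases are clean, so fixing a document may take several validation rounds.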
You could also try to be smart and filter the validation report using XSLT, looking at the paths associated with the individual errors and grouping them. The main reason we introduced an XML validation report was to allow this kind of tailoring.
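A minimal sketch of such a filter follows. It uses Python's standard library rather than XSLT, purely to keep the example short; the same grouping could be expressed in XSLT. The heuristic is an assumption of mine, not something Saxon provides: an error is dropped when another error is reported on a descendant of the same element, judged by the `path` attribute. The sample report is a trimmed copy of the one in the question, with shortened messages.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the validation report shown in the question.
NS = {"v": "http://saxon.sf.net/ns/validation"}

# Trimmed copy of the report from the question (messages shortened).
SAMPLE_REPORT = """\
<validation-report xmlns="http://saxon.sf.net/ns/validation">
  <error path="/Q{}Root[1]/Q{}Record[1]/Q{}LoanClosedDate[1]"
         constraint="cvc-datatype-valid.1">Year should be after 1900</error>
  <error path="/Q{}Root[1]/Q{}Record[1]"
         constraint="sec-cvc-assertion.0">cannot be a date after LoanClosedDate</error>
  <error path="/Q{}Root[1]/Q{}Record[1]"
         constraint="sec-cvc-assertion.0">cannot be null if a LoanClosedDate exists</error>
</validation-report>
"""


def filter_report(report_xml: str) -> list:
    """Return (path, message) pairs, dropping errors reported on an element
    when some descendant of that element also has an error."""
    root = ET.fromstring(report_xml)
    errors = [(e.get("path"), "".join(e.itertext()).strip())
              for e in root.findall("v:error", NS)]
    paths = [p for p, _ in errors if p]
    kept = []
    for path, message in errors:
        # Errors without a path are always kept; otherwise look for another
        # error whose path lies strictly below this one.
        shadowed = bool(path) and any(other.startswith(path + "/") for other in paths)
        if not shadowed:
            kept.append((path, message))
    return kept


if __name__ == "__main__":
    for path, message in filter_report(SAMPLE_REPORT):
        print(path, "->", message)
```

Note that this filter has the drawback mentioned earlier: a record-level assertion that fails for an unrelated reason is hidden as long as any child of that record is invalid.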