php xml xsd saxon saxon-c

Saxon PHP XML schema validation with entity reference


I'm developing a PHP application with the Saxon/C API (EE edition) that needs to validate XML files against an XSD schema.

I'm getting the error below when I run the validation:

org.xml.sax.SAXParseException; systemId: file:**path**/temp.xml; lineNumber: 6; columnNumber: 48; The entity "nbsp" was referenced, but not declared

My XML file content is:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE section [
<!ENTITY % ent1 SYSTEM "isonum.ent">
]>
<section>
    <section-heading>This is a test Heading &nbsp; and &amp; check</section-heading>
    <section>
        <section-heading>Another sub section heading with &nbsp; and &amp; check</section-heading>
        
    </section>
</section>

The XML references an entity file, isonum.ent, which is placed in the same directory as the XML file.

The entity file contains a definition for &nbsp;:

<!ENTITY rdquo  "&#x201D;" ><!--=double quotation mark, right-->
<!ENTITY nbsp   "&#160;" ><!--=no break (required) space-->
<!ENTITY shy    "&#173;" ><!--=soft hyphen-->

My PHP code for the validation is below:

    $proc = new Saxon\SaxonProcessor(true);
    $proc->setConfigurationProperty("xsdversion", "1.1");
    $proc->setConfigurationProperty("http://saxon.sf.net/feature/validationWarnings", "true");
    $proc->setConfigurationProperty("http://saxon.sf.net/feature/multipleSchemaImports", "on");

    $val = $proc->newSchemaValidator();
    $val->registerSchemaFromFile($xsd_path);
    $val->setProperty("report-node", "true");    
    $val->setProperty("verbose", "true");
    $val->validate($xml_path);

I consulted the documentation at https://www.saxonica.com/saxon-c/documentation/index.html and the samples provided in the library download zip, but could not identify the solution.

How can I tell the schema validator where to look for the entity files? Also, is it possible to get all the errors at once? In this case the validation reported only one &nbsp; issue, whereas there are two &nbsp; references in the file.


Solution

  • This turns out to be a simple user error. The DTD declares a parameter entity but does not reference it, so the content of the parameter entity does not become part of the DTD. It needs to be written:

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE section [
    <!ENTITY % ent1 SYSTEM "isonum.ent">
    %ent1;
    ]>
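
The difference between declaring and referencing a parameter entity can be checked without Saxon. The following is a minimal sketch that uses PHP's bundled DOM extension (libxml2) rather than the Saxon/C API, so it only illustrates the XML parsing behaviour, not schema validation. The temporary directory, the file names declared_only.xml and referenced.xml, the one-line stand-in for isonum.ent, and the helper parsesCleanly() are all hypothetical names introduced for this example.

```php
<?php
// Illustrative sketch using PHP's DOM extension (libxml2), NOT Saxon/C.
// All file names and the helper function below are hypothetical.

$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'entity_demo_' . uniqid();
mkdir($dir);

// One-line stand-in for isonum.ent, stored next to the XML files.
file_put_contents($dir . '/isonum.ent', '<!ENTITY nbsp "&#160;">' . "\n");

// {{REF}} marks where the parameter entity reference may or may not appear.
$template = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE section [
<!ENTITY % ent1 SYSTEM "isonum.ent">
{{REF}}
]>
<section><section-heading>Heading&nbsp;text</section-heading></section>
XML;

// Variant 1: the parameter entity is declared but never referenced.
file_put_contents($dir . '/declared_only.xml', str_replace('{{REF}}', '', $template));
// Variant 2: the parameter entity is declared and then referenced with %ent1;.
file_put_contents($dir . '/referenced.xml', str_replace('{{REF}}', '%ent1;', $template));

function parsesCleanly(string $path): bool
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    // LIBXML_DTDLOAD lets the parser fetch the external entity file;
    // LIBXML_NOENT substitutes entity references with their values.
    $ok = $doc->load($path, LIBXML_DTDLOAD | LIBXML_NOENT);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok !== false;
}

$declaredOnly = parsesCleanly($dir . '/declared_only.xml');
$referenced   = parsesCleanly($dir . '/referenced.xml');

var_dump($declaredOnly, $referenced);

// Remove the temporary files again.
array_map('unlink', glob($dir . '/*'));
rmdir($dir);
```

Under these assumptions, the variant without %ent1; should fail to parse because nbsp is never declared, while the variant with %ent1; should load, since the reference is what pulls the declarations from the entity file into the DTD. The relative system identifier "isonum.ent" is resolved against the location of the XML file, which is why the entity file sits in the same directory.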