
xmllint: how to validate an XML file using a local DTD file


I have a local DTD file, test.dtd. Its content is:

<!DOCTYPE coord [
<!ELEMENT coord (date)>
<!ELEMENT date (#PCDATA)>
]>

I'd like to validate an XML file (my.xml) using xmllint. This file has no DOCTYPE in it:

<?xml version="1.0" encoding="x-mac-roman"?>
<coord>
    <date>20150312</date>
</coord>

It works if I insert the DTD block as the second line of a copy of my XML file (my2.xml) and run:

xmllint --valid --noout my2.xml

But when I try either of these:

xmllint --loaddtd test.dtd --valid --noout my.xml

xmllint --dtdvalid test.dtd --noout my.xml

neither works. The output is:

test.dtd:1: parser error : Content error in the external subset
<!DOCTYPE coord [
^
test.dtd:1: parser error : Content error in the external subset
<!DOCTYPE coord [
^
Could not parse DTD test.dtd

Any ideas? It seems that my XML must contain a DOCTYPE line (with the SYSTEM keyword) referencing the external DTD file, which I want to avoid. See: http://www.w3schools.com/dtd/

Is there a solution that does not require modifying the XML?


Solution

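  • First of all, an external DTD file must not contain the <!DOCTYPE ... [ ... ]> wrapper. That syntax belongs only inside an XML document, which is why the parser reports "Content error in the external subset". Remove it from the DTD file so that only the declarations remain: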

    <!ELEMENT coord (date)>
    <!ELEMENT date (#PCDATA)>
    

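    Then, --loaddtd only fetches the external DTD that the document itself references, which is not the same as validating against a DTD of your choice. It also takes no argument, so in your first command test.dtd is treated as another input file. Use the --dtdvalid option instead (test.xml here stands for your my.xml):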

    $ xmllint --noout --dtdvalid test.dtd test.xml
    

    If the XML document is valid, xmllint will not output anything (because of --noout). If you change the DTD to, say:

    <!ELEMENT coord (date,other)>
    <!ELEMENT date (#PCDATA)>
    

    the output will be:

    $ xmllint --noout --dtdvalid test.dtd test.xml
    test.xml:2: element coord: validity error : Element coord content does not follow the DTD, expecting (date , other), got (date )
    Document test.xml does not validate against test.dtd
    

    For more information, see the xmllint documentation pages at NMT or XMLSoft.
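
If the DTD file cannot be edited by hand, the wrapper can also be stripped on the fly. The following is a minimal sketch, not a general solution: the function name strip_doctype_wrapper and the file name external.dtd are made up for illustration, and it assumes the opening "<!DOCTYPE name [" and the closing "]>" each sit alone on their own line, as in the question. The XML declaration here omits the x-mac-roman encoding from the question to keep the demo portable.

```shell
#!/bin/sh
# Hypothetical helper: reads a DTD on stdin and drops the
# "<!DOCTYPE name [" and "]>" wrapper lines, leaving only the declarations.
# Assumes each wrapper line stands alone, with no trailing whitespace.
strip_doctype_wrapper() {
    sed -e '/^<!DOCTYPE[^>]*\[$/d' -e '/^]>$/d'
}

# Demo in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The DTD exactly as posted in the question.
cat > test.dtd <<'EOF'
<!DOCTYPE coord [
<!ELEMENT coord (date)>
<!ELEMENT date (#PCDATA)>
]>
EOF

# A document without a DOCTYPE, as in the question.
cat > my.xml <<'EOF'
<?xml version="1.0"?>
<coord>
    <date>20150312</date>
</coord>
EOF

# Produce a file that is usable as an external DTD.
strip_doctype_wrapper < test.dtd > external.dtd

# Validate only if xmllint is installed; it exits non-zero on failure.
if command -v xmllint >/dev/null 2>&1; then
    if xmllint --noout --dtdvalid external.dtd my.xml; then
        echo "my.xml validates against external.dtd"
    else
        echo "my.xml does not validate" >&2
    fi
fi

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Because xmllint signals a validation failure through its exit status, the same check can be used directly in a build script or CI step without parsing the error text.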