
How to check if XML is "well formed" in a shell script


I am new to shell scripting.

Requirement:

I want to check whether an XML file is well formed. I don't have a schema or anything else to validate it against; I only want to check that it is well formed.

Example of XML I consider correct:

<heading>Reminder</heading>
<body>Don't forget me this weekend!</body> 

Example of XML I consider incorrect:

<heading>Reminder</head>
<body>Don't forget me this weekend!</>

What I have tried so far:

I ran the following command in a shell script:

xmllint --valid filename.xml  # filename.xml replaced with my own file

The error I get:

valid_xml.xml:2: validity error : Validation failed: no DTD found !

The XML file that produces this error:

<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Tove</to>
<from>Jani</from> 
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

Solution

  • When you use --valid, xmllint attempts to validate the document against its DTD. Since your document does not declare one, validation fails with the error message (Validation failed: no DTD found !), even though the document itself is well formed.

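    To check only whether the XML document is "well formed", run xmllint without --valid and test its exit status:

    # --noout stops xmllint from echoing the parsed document;
    # parse errors are still printed to stderr.
    if xmllint --noout filename.xml ; then
        echo "Well formed"
    else
        echo "Not well formed"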
    fi
    

    Note that none of the sample snippets provided by the OP (except the last one) are well formed. The snippet marked as correct is missing a single top-level (root) element that encloses everything else. It should be:

    <root>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body> 
    </root>
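
If several files need checking, the same test can be wrapped in a small function. This is only a sketch: the function name check_well_formed and the message wording are made up for illustration, and it assumes xmllint (from libxml2) is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper (name and messages are illustrative, not from xmllint).
# Prints one line per file; returns 0 only if every file is well formed.
check_well_formed() {
    rc=0
    for file in "$@"; do
        # --noout: do not echo the parsed document.
        # 2>/dev/null: hide xmllint's parse errors; drop this to see them.
        if xmllint --noout "$file" 2>/dev/null; then
            echo "$file: well formed"
        else
            echo "$file: NOT well formed"
            rc=1
        fi
    done
    return "$rc"
}
```

It could then be called as check_well_formed note.xml other.xml, and its return value used in an if statement in the same way as the single-file check above.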