Tags: xml, xml-parsing, xmlstarlet

"Opening and ending tag mismatch" error from `xmlstarlet sel`


http://doi.cnki.net/Resolution/Handler?doi=10.13345/j.cjb.180087

When I run `xmlstarlet sel -t -v '//i[@class = "iconSucc"]/@class'` on the file linked above, I get the following error messages. Does anybody know how to fix the problem?

-:77.294: Opening and ending tag mismatch: img line 77 and a
 middle;" src="/Content/images/gongshangbiaoshi.gif" alt="" class="footpic"></a>
                                                                               ^
-:77.301: Opening and ending tag mismatch: a line 77 and span
;" src="/Content/images/gongshangbiaoshi.gif" alt="" class="footpic"></a></span>
                                                                               ^
-:77.332: Opening and ending tag mismatch: span line 77 and p
i.gif" alt="" class="footpic"></a></span><br />©2014-2018中国知网(CNKI) </p
                                                                               ^
-:79.15: Opening and ending tag mismatch: p line 74 and div
        </div>
              ^
-:92.8: Opening and ending tag mismatch: div line 61 and body
</body>
       ^
-:93.8: Opening and ending tag mismatch: body line 11 and html
</html>
       ^
-:94.1: Premature end of data in tag html line 2

^

Solution

  • Your HTML file is not well-formed XML, which is what xmlstarlet expects. The <img> element at line 77

    <img style="height: 24px; border: 0px none; vertical-align: middle;" src="/Content/images/gongshangbiaoshi.gif" alt="" class="footpic" >
    

    is not closed. End the tag with /> so that the element is self-closing and the document is well-formed:

    <img style="height: 24px; border: 0px none; vertical-align: middle;" src="/Content/images/gongshangbiaoshi.gif" alt="" class="footpic" />
    

    Then the output will be:

    iconSucc

    EDIT:

    Alternatively, xmllint can parse the file as HTML and write it out as well-formed XML, so the query works in a single pipeline without editing the file:

    xmllint --html --xmlout Handler.xml | xmlstarlet sel -t -v '//i[@class = "iconSucc"]/@class'
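
The two fixes from the answer can be tried on a tiny sample before touching the real page. The following is only a sketch under stated assumptions: the sample markup and the file names bad.html and good.xml are made up for illustration, both xmlstarlet and xmllint are assumed to be installed, and xmllint is called with the double-dash option spellings from its man page. Discarding xmllint's stderr with 2>/dev/null is a convenience, not part of the original command.

```shell
#!/bin/sh
# Sketch: reproduce the error on a made-up sample, then try both fixes.
xpath='//i[@class = "iconSucc"]/@class'
tmp=$(mktemp -d)

# Hypothetical sample with an unclosed <img>: acceptable HTML, not well-formed XML.
printf '%s\n' '<html><body><p><img src="a.gif" alt="" ><i class="iconSucc">ok</i></p></body></html>' > "$tmp/bad.html"

# Fix 1: the same markup with the <img> element self-closed by hand.
printf '%s\n' '<html><body><p><img src="a.gif" alt="" /><i class="iconSucc">ok</i></p></body></html>' > "$tmp/good.xml"

# Querying the broken file directly: xmlstarlet reports errors on stderr
# and exits with a non-zero status.
xmlstarlet sel -t -v "$xpath" "$tmp/bad.html" || echo "query on bad.html failed" >&2

# Querying the hand-fixed file.
fixed=$(xmlstarlet sel -t -v "$xpath" "$tmp/good.xml")

# Fix 2: let xmllint's HTML parser read the broken file and emit XML,
# then run the same query on that output.
piped=$(xmllint --html --xmlout "$tmp/bad.html" 2>/dev/null | xmlstarlet sel -t -v "$xpath")

printf 'hand-fixed: %s\n' "$fixed"
printf 'via xmllint: %s\n' "$piped"

rm -r "$tmp"
```

If both tools behave as described in the answer, each of the last two lines reports iconSucc. The pipeline is the more practical choice for pages you do not control, since real-world HTML often contains more than one unclosed void element.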