xpath xmllint

xmllint returns "XPath set is empty" on a valid query


I need to run full XPath 3 queries on a stream.

With the below content:

$ cat h3.html 
<body>
<h3 class="heading"><span class="content">First Title</span></h3>
<h3 class="no-num no-ref heading settled"><span class="level">45.6</span></h3>
<h3 class="no-num no-ref heading settled" id="informative"><span class="content">Informative References</span><a class="self-link"   href="#informative"></a></h3>
<h3 class="heading"><span class="content">More Stuff</span></h3>
</body>
$

xmllint tells me the XPath set is empty:

$ xmllint --shell --xpath "//h3//span[@class="content"]/text()" h3.html 
XPath set is empty
/ > ^C
$ xmllint --xpath "//h3//span[@class="content"]/text()" h3.html 
XPath set is empty
$ xmllint --html --xpath "//h3//span[@class="content"]/text()" h3.html 
XPath set is empty
$ 

The version of xmllint (on Ubuntu 22.04):

$ xmllint --version
xmllint: using libxml version 20913
compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer       XInclude Iconv ICU ISO8859X Unicode Regexps Automata Schemas Schematron Modules Debug Zlib Lzma
$

My XPath query should return the following (tested with hq, which supports only a subset of XPath 3):

$ cat h3.html | hq '//h3//span[@class="content"]/text()'
First Title
Informative References
More Stuff
$

Solution

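  • I suggest switching to single quotes. Inside a double-quoted shell string, the inner double quotes around content close and reopen the string, so the shell strips them and xmllint receives //h3//span[@class=content]/text(). That expression compares @class with a child element named content, which does not exist, so the result set is empty. With single quotes the expression reaches xmllint intact: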

    xmllint --xpath '//h3/span[@class="content"]/text()' h3.html
    

    Output:

    First Title
    Informative References
    More Stuff
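
To see what the shell actually hands to xmllint, the quoting can be checked without xmllint at all. The following is a minimal sketch; the variable names (double_quoted, single_quoted, escaped) are made up for illustration and are not part of the question or of xmllint.

```shell
#!/bin/sh
# Illustrative variable names only; nothing here calls xmllint.

# Nested double quotes: the shell removes the inner pair,
# so the word "content" ends up unquoted inside the expression.
double_quoted="//h3//span[@class="content"]/text()"

# Single quotes: the inner double quotes are preserved verbatim.
single_quoted='//h3//span[@class="content"]/text()'

# Backslash-escaped inner quotes also survive inside double quotes.
escaped="//h3//span[@class=\"content\"]/text()"

printf '%s\n' "$double_quoted"   # //h3//span[@class=content]/text()
printf '%s\n' "$single_quoted"   # //h3//span[@class="content"]/text()
printf '%s\n' "$escaped"         # //h3//span[@class="content"]/text()
```

Either of the last two forms can be passed to xmllint --xpath. Since the question mentions a stream: xmllint should also accept - as the file name to read standard input, for example cat h3.html | xmllint --xpath '//h3//span[@class="content"]/text()' -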