I need to run full XPath 3 queries on a stream.
With the below content:
$ cat h3.html
<body>
<h3 class="heading"><span class="content">First Title</span></h3>
<h3 class="no-num no-ref heading settled"><span class="level">45.6</span></h3>
<h3 class="no-num no-ref heading settled" id="informative"><span class="content">Informative References</span><a class="self-link" href="#informative"></a></h3>
<h3 class="heading"><span class="content">More Stuff</span></h3>
</body>
$
xmllint tells me the XPath set is empty:
$ xmllint --shell --xpath "//h3//span[@class="content"]/text()" h3.html
XPath set is empty
/ > ^C
$ xmllint --xpath "//h3//span[@class="content"]/text()" h3.html
XPath set is empty
$ xmllint --html --xpath "//h3//span[@class="content"]/text()" h3.html
XPath set is empty
$
The version of xmllint (on Ubuntu 22.04):
$ xmllint --version
xmllint: using libxml version 20913
compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ICU ISO8859X Unicode Regexps Automata Schemas Schematron Modules Debug Zlib Lzma
$
My XPath query should return the following (tested with hq, which supports only a subset of XPath 3):
$ cat h3.html | hq '//h3//span[@class="content"]/text()'
First Title
Informative References
More Stuff
$
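The inner double quotes end the shell's double-quoted string, so xmllint actually receives //h3//span[@class=content]/text(). That compares @class against a child element named content, which does not exist, so nothing matches. I suggest switching to single quotes around the expression: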
xmllint --xpath '//h3/span[@class="content"]/text()' h3.html
Output:
First Title Informative References More Stuff
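To see what the shell does with each quoting style, here is a minimal sketch that only prints the argument a program would receive; it does not call xmllint. The variable names (broken, single_quoted, escaped, inner_single) are hypothetical and used for illustration only.

```shell
# Nested double quotes: the shell closes the string at the second quote,
# reads `content` as a bare word, then reopens the string. The quote
# characters themselves never reach the program.
broken="//h3//span[@class="content"]/text()"

# Three ways to keep a quoted string literal inside the XPath expression.
single_quoted='//h3//span[@class="content"]/text()'
escaped="//h3//span[@class=\"content\"]/text()"
inner_single="//h3//span[@class='content']/text()"

printf '%s\n' "$broken"
printf '%s\n' "$single_quoted"
printf '%s\n' "$escaped"
printf '%s\n' "$inner_single"
```

The first line printed has no quotes around content, which is why the original commands return an empty set. The other three all carry a quoted string literal through to the program.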