I'm trying to get text from an rss 2.0 feed (description tag) using XQilla. The address is here. This is fine but the tag contains escaped HTML like
"<a href="some_address>..."
It would be useful to have this HTML in a node and further work with it, but I am at a loss here. I can get the tag contents with
let $desc := $item/*[name()='description']
but do not know how to unescape it. I tried parse-html, which only strips the text of tags and returns a string, like the data() function. Searching on the web suggests that extension functions exist for this, but in other parsers. Is there a way to do it in XQilla? By the way, the code I am working on is a JAWS ResearchIt lookup source.
XQilla has – like lots of other XQuery implementations – a proprietary function to load XML and HTML from a string (they don't have anchor tags, thus you need to scroll through the document, I'm sorry).
xqilla:parse-xml($xml as xs:string?) as document-node()?
xqilla:parse-html($html as xs:string?) as document-node()?
Given $desc
contains the unparsed HTML, xqilla:parse-html($desc)
will return the parse result.