How to interactively use tDOM?

I feel that I'm missing something subtle here.

I have a $doc which, as $doc asText shows, really does contain the content of the page to be parsed. It came from dom parse -html5 $body.

From here, I'd like to explore the DOM interactively, for example to get a list of anchors. It seems like $doc selectNodes {//a} should work*, but it returns nothing. Neither does anything else I try with selectNodes (/head, /body, /html ... nothing!). I can see that there are childNodes, so the structure seems to be intact.

What is a better way to explore these nodes so I can figure out what is going wrong?

* https://wiki.tcl-lang.org/page/XPath (this is what I was trying to follow)

Solution

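You can simplify your life here: you are working with HTML (not XML or XHTML), since you pass -html5 to dom parse and select HTML elements (anchors). With -html5 alone, the nodes of the resulting DOM tree are placed in XML namespaces, which is why un-prefixed queries such as //a come back empty.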

HTML syntax itself has no notion of namespaces, so you may ignore them: pass the -ignorexmlns option to dom parse.

% package req tdom
0.9.2
% set someHTML {<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <title>Title of the document</title></head><body>
    <svg width="100" height="100">
      <circle cx="50" cy="50" r="40" stroke="green" stroke-width="4" fill="yellow" />
    </svg>
  </body>
</html>}
% set doc [dom parse -html5 -ignorexmlns $someHTML]

With that, you can run your XPath queries without any namespace awareness:

$doc selectNodes {//svg}
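
The same approach should cover the anchors from the question. The following is only a sketch: the page content in `body` and the variable name `hrefs` are made up for illustration, and it assumes a tDOM build with HTML5 support (otherwise -html5 is not available).

```tcl
package require tdom

# Hypothetical sample page standing in for the real $body.
set body {<!DOCTYPE html>
<html><head><title>Links</title></head>
<body>
  <p><a href="https://example.com/one">One</a></p>
  <p><a href="/two">Two</a> and <a id="no-href">not a link</a></p>
</body></html>}

# Assumption: tDOM was built with HTML5 support, as in the question.
set doc [dom parse -html5 -ignorexmlns $body]

# Collect the href of every anchor that actually has one.
set hrefs {}
foreach a [$doc selectNodes {//a[@href]}] {
    lappend hrefs [$a getAttribute href]
}
puts $hrefs

# Free the DOM tree once it is no longer needed.
$doc delete
```

The predicate [@href] skips anchors without a target, so getAttribute never has to deal with a missing attribute.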

Note that the tDOM documentation itself suggests this combination:

Since this probably isn't wanted by a lot of users and adds only burden for no good in a lot of use cases -html5 can be combined with -ignorexmlns, in which case all nodes and attributes in the DOM tree are not in an XML namespace.
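
If you would rather keep the namespaces, or simply want to see what went wrong, you can inspect the nodes directly and bind a prefix for the query. This is a hedged sketch rather than the one true way: the sample page and the prefix "h" are arbitrary choices, and it again assumes a tDOM build with HTML5 support.

```tcl
package require tdom

# Hypothetical sample page.
set html {<!DOCTYPE html>
<html><head><title>t</title></head>
<body><a href="/x">x</a></body></html>}

# Parse WITHOUT -ignorexmlns to reproduce the situation from the question.
set doc  [dom parse -html5 $html]
set root [$doc documentElement]

# Inspect the tree step by step instead of guessing at XPath expressions.
puts [$root nodeName]        ;# element name of the root
puts [$root namespaceURI]    ;# non-empty when the element is in a namespace
foreach child [$root childNodes] {
    puts "[$child nodeName] -> [$child toXPath]"
}

# Un-prefixed names only match elements that are in no namespace.
set plain [$doc selectNodes {//a}]

# Bind the arbitrary prefix "h" to the root's namespace and query with it.
set ns [$root namespaceURI]
set prefixed [$doc selectNodes -namespaces [list h $ns] {//h:a}]

puts "plain: [llength $plain], prefixed: [llength $prefixed]"
$doc delete
```

Checking namespaceURI on the document element is usually the quickest way to tell whether an empty selectNodes result is a namespace problem or a genuinely missing node.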