Search code examples
javascripthtmldom

Nested subtree traversal in the insert algorithm DOM


In general, those who worked with DOM, heard about insertion methods, for example, append, prepend and some others.

I decided to study how the whatwg specification describes them and I encountered difficulties.

This is what I am talking about (insert algorithm):

  1. For each node in nodes, in tree order:
    1. Adopt node into parent’s node document.
    2. If child is null, then append node to parent’s children.
    3. Otherwise, insert node into parent’s children before child’s index.
    4. If parent is a shadow host whose shadow root’s slot assignment is "named" and node is a slottable, then assign a slot for node.
    5. If parent’s root is a shadow root, and parent is a slot whose assigned nodes is the empty list, then run signal a slot change for parent.
    6. Run assign slottables for a tree with node’s root.
    7. For each shadow-including inclusive descendant inclusiveDescendant of node, in shadow-including tree order:
      1. Run the insertion steps with inclusiveDescendant.
      2. If inclusiveDescendant is connected, then:
        1. If inclusiveDescendant is custom, then enqueue a custom element callback reaction with inclusiveDescendant, callback name "connectedCallback", and an empty argument list.
        2. Otherwise, try to upgrade inclusiveDescendant.

That is, the algorithm indicates that we have a cycle in step 7 in which it goes through the nodes of a light tree recursively (DFS) and performs some steps for each node.

But we also have a substep 7.7, which is a subcycle and which, logically, is even broader since it looks at not only light nodes, but also looks at shadow nodes and their descendants.

Now let's imagine such an example, we have some document there is a body node in which it is empty and we want to insert complex markup (of course it will be node with descendants) into it via the append method, let's say it will be like this:

<section>
  <div id="1">
    <title>Title 1</title>
    <component-description>
      #shadow-root
        <span>Lorem ipsum</span>
    </component-description>
  </div>
  <div id="2">
    <title>Title 2</title>
    <component-description>
      #shadow-root
        <span>Lorem ipsum</span>
    </component-description>
  </div>
  <div id="3">
    <title>Title 3</title>
    <component-description>
      #shadow-root
        <span>Lorem ipsum</span>
    </component-description></div>
</section>

So when the insertion algorithm is run, when we enter the tree traversal cycle step 7, do some steps for the section node (specified above), then meet step 7.7 and traverse all nodes that contain section and its children including shadow ones. Is it?

Okay, now when we return to step 7, we get a new node that is a child of section, this is div with id="1". Now some steps will be performed for it, and then step 7.7, which will again traverse all nodes for div.

If we analyze the traversal by nodes, then div with id="1" and some of its children will be traversed by step 7.7 several times. For what?

Main question: Why use For each node in nodes, in tree order when the clause For each shadow-including inclusive descendant inclusiveDescendant of node, in shadow-including tree order is much broader and traverse shadow nodes?

I understand the difference between the traversal cycles (one includes traversal of shadow trees, the other does not). Probably I am missing something. I need help.


There is an even stranger part of this same algorithm below.

  1. For each node of nodes, in tree order:
    1. For each shadow-including inclusive descendant inclusiveDescendant of node, in shadow-including tree order, append inclusiveDescendant to staticNodeList.

Why is this done? I'm probably missing something.


I wrote a little example that demonstrates that some nodes will be visited a several times:

let wrapper = document.createElement("div");
wrapper.setHTMLUnsafe(`<section>
  <div id="1">
    <title>Title 1</title>
    <component-description>
        <template shadowrootmode="open">
            <span>Lorem ipsum</span>
        </template>
    </component-description>
  </div>
  <div id="2">
    <title>Title 2</title>
    <component-description>
        <template shadowrootmode="open">
            <span>Lorem ipsum</span>
        </template>
    </component-description>
  </div>
  <div id="3">
    <title>Title 3</title>
    <component-description>
        <template shadowrootmode="open">
            <span>Lorem ipsum</span>
        </template>
    </component-description></div>
</section>`)
let targetNode = wrapper.firstElementChild;

/// This is step 7 in spec
function traverseTree(node, shadows){
  if (node.visited === undefined) node.visited = 0;
  ++node.visited;
  node.setAttribute?.("visited", node.visited);
  console.log(node, "=>", node.visited);

  if (typeof shadows === "function") shadows(node);

  for (let child = node.firstChild; child; child = child.nextSibling){
      traverseTree(child, shadows);
  }
}

function traverseTreeWithShadows(node){
  if (node.visited === undefined) node.visited = 0;
  ++node.visited
  node.setAttribute?.("visited", node.visited);

  if (node.shadowRoot) traverseTreeWithShadows(node.shadowRoot);
  for (let child = node.firstChild; child; child = child.nextSibling) {
      traverseTreeWithShadows(child);
  }
}

traverseTree(targetNode, traverseTreeWithShadows)


Solution

  • Two things you got confused about here:

    In your scenario nodes is actually just « node », i.e. a list with a single item node (the <section> element). The case where nodes size is greater than one is when node was a DocumentFragment.

    1. Let nodes be node’s children, if node is a DocumentFragment node; otherwise « node ».

    So the outer loop actually loops only once.

    Then, it seems you thought the insertion steps would somehow reenter this algorithm? It won't. The insertion steps are different algorithms entirely. It's basically callbacks that are to be ran for some elements (like loading an <iframe>'s content, for instance).

    So until step 7, only the <section> element has been visited by the outer loop, and the insertion steps of each of its descendants have been called.

    Then the step 11 does loop once more over the descendants in order to get a static list of descendant nodes. To paraphrase the note just above, it's done in order to avoid having a live list when firing the post-connection-steps because these steps can execute code which could modify the tree and thus affect the live list. Having a static one, we're sure every descendant gets its post-connection-steps fired only once.