How can I count pairs of matching descendant nodes?


I want to count the number of <C> nodes that match a certain <A> node.

Example tree with desired answer 2:

<e>
  <A> txt </A>
  <e>
    <A> txt </A>
    <B>
      <C> txt </C>
      <C> txt </C>
    </B>
  </e>
</e>

Example tree with desired answer 0:

<e>
  <A> txt </A>
  <e>
    <A> NO </A>
    <B>
      <C> txt </C>
      <C> txt </C>
    </B>
  </e>
</e>

The path should be relative to //e/e. I've tried:

  • //e/e[count(A = .//C)] -- not valid XPath since we can only count() nodes, not logical conditions
  • //e/e[count(A[self::* = .//C])] -- wrong. Only counts the A nodes, so gives 1, not 2, in the first case
  • //e/e[count(.//C[self::* = ancestor::*/preceding-sibling::A])] -- wrong. Includes the <A> node which is an ancestor of the "inner" <e>, giving 1, which is incorrect.

The actual trees are more complicated, of course, so constructing a condition on ancestor::* will be hard -- any risk of matching ancestors of the "inner" <e> is likely to produce false positives.

Is there some syntax/function I'm missing that can help to count matching pairs like here?


Solution

  • This XPath returns the expected count for the provided samples:

    count(//e/e//C[. = preceding::A[1]])
    

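    This can be read as: "count the C nodes, descendants of an inner e node, whose string value equals that of the nearest preceding A node". Because preceding:: is a reverse axis, the position [1] selects the A closest to each C in document order, not the first A in the document.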

    Given this sample

    <e>
      <A> txtNO </A>
      <e>
        <A> txt </A>
        <B>
          <C> txt </C>
          <C> txtNO </C>
          <C> txt </C>
        </B>
      </e>
    </e>
    

    Result

    xmllint --xpath 'count(//e/e//C[. = preceding::A[1]])' tmp.xml
    2
    

    Testing the second preceding A node instead, with count(//e/e//C[. = preceding::A[2]]), would return 1, since only the txtNO node would be matched.
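
For readers without an XPath engine that supports the preceding:: axis (Python's built-in ElementTree does not, for example), the same selection logic can be sketched by hand. This is only an illustration of what the expression above computes, not a replacement for it: the helper names count_matching_c, string_value and under_inner_e are hypothetical, and the sketch assumes element names without namespaces.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<e>
  <A> txtNO </A>
  <e>
    <A> txt </A>
    <B>
      <C> txt </C>
      <C> txtNO </C>
      <C> txt </C>
    </B>
  </e>
</e>
"""

def string_value(el):
    # XPath string-value of an element: all descendant text concatenated.
    return "".join(el.itertext())

def under_inner_e(ancestors):
    # True if some ancestor is an <e> whose parent is also an <e> (//e/e//...).
    return any(p.tag == "e" and c.tag == "e"
               for p, c in zip(ancestors, ancestors[1:]))

def count_matching_c(xml_text, nth=1):
    """Emulate count(//e/e//C[. = preceding::A[nth]]) with a manual walk."""
    root = ET.fromstring(xml_text)
    seen_a = []   # every <A> met so far, in document order
    total = 0

    def walk(node, ancestors):
        nonlocal total
        if node.tag == "C" and under_inner_e(ancestors):
            # The preceding axis excludes the context node's ancestors.
            preceding = [a for a in seen_a
                         if all(a is not anc for anc in ancestors)]
            # Reverse axis: position nth counts backwards from the context node.
            if len(preceding) >= nth and \
                    string_value(node) == string_value(preceding[-nth]):
                total += 1
        if node.tag == "A":
            seen_a.append(node)
        for child in node:
            walk(child, ancestors + [node])

    walk(root, [])
    return total

if __name__ == "__main__":
    print(count_matching_c(SAMPLE))         # nearest preceding A
    print(count_matching_c(SAMPLE, nth=2))  # second nearest preceding A
```

Run on the sample above, the two calls correspond to the preceding::A[1] and preceding::A[2] variants of the XPath. Note that the comparison is on raw string values, so surrounding whitespace matters, just as it does for = in XPath 1.0.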