Difference between these XPath expressions in Word Processing Markup Language (WordML)?


In Word Processing Markup Language (WordML) for Microsoft Office Word 2003, is there any difference between the following two XPath expressions:

@".//w:p[count(*/w:t) >= 0]"

and

@".//w:p"

I am confused because, if the count is allowed to be zero, what is the purpose of the expression in the square brackets? Don't both select the same number of nodes?

Also, does */w:t select only w:t elements that are grandchildren, or will it take immediate children into account too?


Solution

  • This expression:

    ".//w:p[count(*/w:t) >= 0]"
    

    selects all descendants of the context node (., whatever it is in the current state of processing) that are elements with the qualified name w:p, but only if their children (most likely w:r elements) contain at least 0 w:t elements in total.

    That does not make a lot of sense, of course, because a count can never be less than 0, so the predicate is always true. This would make sense:

    ".//w:p[count(*/w:t) >= 1]"
    

    But actually, this would be sufficient:

    ".//w:p[descendant::w:t]"
    

    The rationale behind it is to select w:p elements only if they contain text, which is stored in w:t elements, which in turn are stored in w:r elements ("runs").


    On the other hand,

    ".//w:p"
    

    selects all w:p elements that are descendants of the context node, regardless of whether they in turn contain a w:t descendant or not.


    EDIT

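    Don't both select the same number of nodes?

    Yes, both select exactly the same nodes, because count() never returns a negative number and the predicate [count(*/w:t) >= 0] is therefore always true. The plain .//w:p is the sensible way to write it, whereas the version with the predicate is not.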

    Does */w:t select only w:t elements that are grandchildren, or will it take immediate children into account too?

    This expression will only take into account w:t elements that are grandchildren of w:p. Besides, this (i.e. w:t as an immediate child of w:p) is not allowed in the OOXML Schema.
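
The points above can be checked on a small sample. What follows is a minimal sketch in Python rather than the .NET API from the question, under these assumptions: the sample document, the `text_count` helper and the variable names are hypothetical, and the fourth paragraph (a w:t directly under w:p) is deliberately not schema-valid and is there only to show what `*/w:t` ignores. Python's standard `xml.etree.ElementTree` supports only a subset of XPath and has no `count()` function, so the predicate is emulated with `len()` over the result of the `*/w:t` path.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the w: prefix in Word 2003 XML documents.
W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# Hypothetical miniature document, for illustration only.
SAMPLE = f"""
<w:body xmlns:w="{W}">
  <w:p><w:r><w:t>Hello</w:t></w:r></w:p>
  <w:p><w:r/></w:p>
  <w:p/>
  <w:p><w:t>not schema-valid: w:t directly under w:p</w:t></w:p>
</w:body>
""".strip()

root = ET.fromstring(SAMPLE)

# Equivalent of ".//w:p": every w:p below the context node.
all_paragraphs = root.findall(".//w:p", NS)


def text_count(p):
    """Emulates count(*/w:t): w:t elements exactly two levels below p."""
    return len(p.findall("*/w:t", NS))


# Emulates ".//w:p[count(*/w:t) >= 0]": the condition can never fail.
at_least_zero = [p for p in all_paragraphs if text_count(p) >= 0]

# Emulates ".//w:p[count(*/w:t) >= 1]": only paragraphs with text in a child element.
at_least_one = [p for p in all_paragraphs if text_count(p) >= 1]

# The last paragraph has a w:t child, but */w:t does not see it.
stray = all_paragraphs[3]
direct_children = len(stray.findall("w:t", NS))

print(len(all_paragraphs), len(at_least_zero), len(at_least_one))  # 4 4 1
print(direct_children, text_count(stray))  # 1 0
```

The `>= 0` filter returns the same four paragraphs as the unfiltered query, while `>= 1` keeps only the paragraph whose run contains a w:t. The last line shows that `*/w:t` skips a w:t sitting directly under w:p, since the `*` step always consumes one level first.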