In my XML, I have comment markers like <!--INS--><!--/INS--> and <!--DEL--><!--/DEL-->; I want to ignore any matching text between them when I'm searching for specific substrings.
For example, my XML file has:
<p>XXXX YYYY ZZZZ
<!--INS-->,,<!--/INS-->
<!--DEL-->..<!--/DEL-->
AAA BBB CCC DDD..
</p>
I want to find the p elements that contain a double dot, but I need to ignore any double dot that sits between the "INS" or "DEL" comment markers.
I have tried this XPath:
//p[contains(.,'..') and descendant::comment()[not(contains(.,'..'))]]
but it is not working. How can I do this in XPath?
Your ".." and ",," are not inside comment() nodes; they are in text() nodes between the comment() nodes. So, if I understand correctly, you need this (wrong assumption, see EDIT):
//p[ends-with(normalize-space(),'..') and not(comment()[contains('INSDEL',.) and following-sibling::node()[1][self::text()[.='..']]])]
This will not match your example.
Explanation of this part:
following-sibling::node()[1][self::text()]
It selects the node that immediately follows that comment, but only if that node is a text() node.
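To make the node structure concrete, here is a minimal sketch in Python using the standard library's xml.dom.minidom (Python and the function names describe_children and text_after_comments are assumptions for illustration, not part of the question). It lists the child nodes of the p element from the question and shows which text node directly follows each comment, which is the DOM counterpart of following-sibling::node()[1][self::text()].

```python
from xml.dom import minidom

# The example document from the question.
XML = """<p>XXXX YYYY ZZZZ
<!--INS-->,,<!--/INS-->
<!--DEL-->..<!--/DEL-->
AAA BBB CCC DDD..
</p>"""


def describe_children(xml_text):
    """Return (kind, data) pairs for the child nodes of the root element."""
    root = minidom.parseString(xml_text).documentElement
    kinds = {root.TEXT_NODE: "text", root.COMMENT_NODE: "comment"}
    return [
        (kinds.get(node.nodeType, "other"), getattr(node, "data", ""))
        for node in root.childNodes
    ]


def text_after_comments(xml_text):
    """Map each comment's content to the text node that immediately follows it."""
    root = minidom.parseString(xml_text).documentElement
    result = {}
    for node in root.childNodes:
        if node.nodeType != node.COMMENT_NODE:
            continue
        nxt = node.nextSibling
        # Only record the sibling when it is a text node,
        # mirroring the [self::text()] filter in the XPath.
        if nxt is not None and nxt.nodeType == nxt.TEXT_NODE:
            result[node.data] = nxt.data
    return result


if __name__ == "__main__":
    for kind, data in describe_children(XML):
        print(kind, repr(data))
    print(text_after_comments(XML))
```

The comments themselves only contain the marker names (INS, /INS, DEL, /DEL); the ",," and ".." are sibling text nodes, which is why a test like comment()[contains(.,'..')] never sees them.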
If you want only the following case not to match, where both the INS and the DEL marker are followed by ".." (also a wrong assumption, see EDIT):
<p>XXXX YYYY ZZZZ
<!--INS-->..<!--/INS-->
<!--DEL-->..<!--/DEL-->
AAA BBB CCC DDD..
</p>
You need:
//p[ends-with(normalize-space(),'..') and
not(comment()[.='INS' and following-sibling::node()[1][self::text()[.='..']]]
and comment()[.='DEL' and following-sibling::node()[1][self::text()[.='..']]])]
EDIT:
The following XPath:
//p[text()[not(preceding-sibling::node()[1][self::comment()=('INS','DEL') ] ) and contains(.,'..')]]
will match this example:
<p>Save at least 15% on local breaks, longer trips, or anything in between.. Plan your next getaway for less.
<!--INS-->Book between Mar.. 15 - 31<!--/INS-->
<!--DEL-->Stay between May. 15-31<!--/DEL--> Getaway Deals. </p>
because it has a double dot in a text() node that is not between the comments.
But it will not match
<p>Save at least 15% on local breaks, longer trips, or anything in between. Plan your next getaway for less.
<!--INS-->Book between Mar.. 15 - 31<!--/INS-->
<!--DEL-->Stay between May. 15-31<!--/DEL--> Getaway Deals. </p>
because the only double dots are in a text() node between the comments.
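For readers who want to check this behaviour without an XPath 2.0 processor, the same rule can be sketched in Python with the standard library's xml.dom.minidom. This is only an illustration of the logic of the XPath in the EDIT, not a replacement for it; Python and the names MARKERS, has_double_dot_outside_markers and matching_paragraphs are assumptions introduced here.

```python
from xml.dom import minidom

# Comment contents that open a span whose text should be ignored.
MARKERS = {"INS", "DEL"}

MATCHING = """<p>Save at least 15% on local breaks, longer trips, or anything in between.. Plan your next getaway for less.
<!--INS-->Book between Mar.. 15 - 31<!--/INS-->
<!--DEL-->Stay between May. 15-31<!--/DEL--> Getaway Deals. </p>"""

NON_MATCHING = """<p>Save at least 15% on local breaks, longer trips, or anything in between. Plan your next getaway for less.
<!--INS-->Book between Mar.. 15 - 31<!--/INS-->
<!--DEL-->Stay between May. 15-31<!--/DEL--> Getaway Deals. </p>"""


def has_double_dot_outside_markers(p):
    """True if a child text node contains '..' and is not directly preceded
    by an INS or DEL comment."""
    for node in p.childNodes:
        if node.nodeType != node.TEXT_NODE or ".." not in node.data:
            continue
        prev = node.previousSibling
        inside_marker = (
            prev is not None
            and prev.nodeType == prev.COMMENT_NODE
            and prev.data in MARKERS
        )
        if not inside_marker:
            return True
    return False


def matching_paragraphs(xml_text):
    """Return the p elements that the rule above accepts."""
    doc = minidom.parseString(xml_text)
    return [
        p for p in doc.getElementsByTagName("p")
        if has_double_dot_outside_markers(p)
    ]


if __name__ == "__main__":
    print(len(matching_paragraphs(MATCHING)))
    print(len(matching_paragraphs(NON_MATCHING)))
```

Like the XPath, this only looks at the node immediately before each text node, so it assumes the text to ignore sits directly after the opening INS or DEL comment with no other nodes in between.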