How to skip paragraphs with comments in XPath expression?

I'm trying to scrape websites like this with the following XPath expression:

.//div[@class="tresc"]/p[not(starts-with(text(), "<!--"))]

The problem is that the first paragraph is a comment block, so I'd like to skip it:

<!--[if gte mso 9]><xml>
<w:PunctuationKerning />
<w:ValidateAgainstSchemas />
<w:BreakWrappedTables />
<w:SnapToGridInCell />
<w:WrapTextWithPunct />
<w:UseAsianBreakRules />
<w:DontGrowAutofit />

Unfortunately, my expression does not skip the paragraph with comments. Anyone know what I'm doing wrong?


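  • Comments are not part of text(); they are nodes of their own, matched by comment(). That is why starts-with(text(), "<!--") never sees the comment markup and the predicate filters nothing out. To exclude p elements that contain comments, use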
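
As a rough illustration of that point, here is a minimal Python sketch using only the standard library. The SAMPLE markup and the has_comment_child helper are hypothetical stand-ins, not taken from the page in the question. ElementTree's XPath subset cannot evaluate not() or comment(), so the filter is written as a plain Python check. In a full XPath 1.0 engine, a predicate such as p[not(comment())] should express the same idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the page fragment in the question.
SAMPLE = """<div class="tresc">
  <p><!--[if gte mso 9]><xml><w:PunctuationKerning /></xml><![endif]--></p>
  <p>First real paragraph.</p>
  <p>Second real paragraph.</p>
</div>"""

# ElementTree discards comments unless the tree builder is told to keep them
# (insert_comments requires Python 3.8+).
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = ET.fromstring(SAMPLE, parser=parser)


def has_comment_child(element):
    """Return True if any direct child of element is a comment node."""
    return any(child.tag is ET.Comment for child in element)


paragraphs = root.findall("p")

# The comment is a child node of the first <p>, not part of its text,
# which is why a text()-based test cannot detect it.
first_text = paragraphs[0].text

# Keep only paragraphs without a comment child, mirroring p[not(comment())].
kept = [p.text for p in paragraphs if not has_comment_child(p)]

print(first_text)
print(kept)
```

The first paragraph has no text of its own because its only child is the comment node, so filtering on the presence of a comment child is what drops it.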
