Tags: html, xpath, scrapy, scraper

XPath recursive children selection


I'm using Scrapy to extract data from a website, but I have a problem with my XPath selector. Assume I have this HTML code:

<div id="_parent">
    Hi!
    <p>I am a child!</p>
    <span class="someclass">I am a <b>span</b> child!</span>
</div>

What I get:

I am a child
I am a  child!

What I should get:

Hi!
I am a child!
I am a span child!

The XPath I am using is the following: .//div[@id="_parent"]//*/text(). I know this is because some of the text is not in a direct child of the #_parent div, but how can I recursively get the text of all the children?


Solution

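  • You can just use: .//div[@id="_parent"]//text() to fetch all text node descendants of the selected node. Your original expression, //*/text(), only selects text nodes whose parent is an element below the div, so text sitting directly inside the div itself (such as "Hi!") is skipped.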
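
To see the difference between the two expressions in this thread, here is a minimal sketch that runs both against the HTML from the question. It assumes lxml is installed and is used directly instead of a Scrapy response; the html/body wrapper, the PAGE variable, and the text_lines helper are illustrative additions, not part of the original question. Note that //text() returns raw text nodes, including whitespace and the separate pieces around the b element, so the sketch joins and re-splits them to get one line per child.

```python
from lxml import html

# HTML from the question, wrapped in html/body (an assumption made here so
# that the relative .//div expression has a parent node to search from).
PAGE = """<html><body>
<div id="_parent">
    Hi!
    <p>I am a child!</p>
    <span class="someclass">I am a <b>span</b> child!</span>
</div>
</body></html>"""

tree = html.fromstring(PAGE)

# //*/text(): only text nodes whose parent is an element below the div.
nested_only = tree.xpath('.//div[@id="_parent"]//*/text()')

# //text(): every text node below the div, including its own direct text.
all_text = tree.xpath('.//div[@id="_parent"]//text()')


def text_lines(nodes):
    """Join raw text nodes, then return the non-empty lines, stripped."""
    joined = "".join(nodes)
    return [line.strip() for line in joined.splitlines() if line.strip()]


lines = text_lines(all_text)
for line in lines:
    print(line)
```

In a Scrapy spider the same expression can be passed to response.xpath(...).getall(), which likewise returns the raw text nodes as a list of strings, so a similar join-and-clean step is usually still needed.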