
Get text of a tag and the text of child tags


I have this HTML:

<p>
        <strong>aquiline</strong>
        <i> adj. </i>
        of or like the eagle.
</p>

This whole node is wrapped in a div with class="field-item even".

I would like to receive: Aquiline adj. of or like the eagle... Right now I have this incorrect XPath: response.xpath('//div[@class="field-item even"]//descendant-or-self::p/text()').getall()


Solution

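  • Your XPath is almost correct. Replace p with * so that the text nodes of every element are selected, not only the text nodes that are direct children of paragraph tags. Also, with the normalize-space function you can get all the text as one string instead of a list: it takes the string value of the first matched node (here the div itself, which includes the text of all its descendants), trims it, and collapses runs of whitespace into single spaces. If several divs match, only the first one is used. See the code snippet below.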

    response.xpath('normalize-space(//div[@class="field-item even"]//descendant-or-self::*)').get()
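
To see why p/text() falls short and what normalize-space does to the result, here is a minimal sketch that uses only the Python standard library instead of Scrapy. It assumes the fragment is well-formed enough for xml.etree.ElementTree (real pages usually need an HTML parser, as Scrapy uses), and the helper names normalize_space and element_text are made up for this illustration. str.split() also treats a few more characters as whitespace than XPath does, so this is only an approximation of the XPath function.

```python
import xml.etree.ElementTree as ET

# The HTML from the question, wrapped in the div the question describes.
HTML = """
<div class="field-item even">
  <p>
        <strong>aquiline</strong>
        <i> adj. </i>
        of or like the eagle.
  </p>
</div>
"""


def normalize_space(text):
    """Approximate XPath normalize-space(): trim and collapse whitespace runs."""
    return " ".join(text.split())


def element_text(element):
    """Concatenate the text of an element and all its descendants, then normalize."""
    return normalize_space("".join(element.itertext()))


root = ET.fromstring(HTML.strip())  # root is the <div>
p = root.find("p")

# Roughly what p/text() sees: only text sitting directly inside <p>,
# i.e. the text before the first child plus the text after each child.
direct_only = normalize_space(
    "".join([p.text or ""] + [child.tail or "" for child in p])
)

# What the string value of the element gives: text of all descendants too.
result = element_text(p)

print(direct_only)  # of or like the eagle.
print(result)       # aquiline adj. of or like the eagle.
```

The first line printed is missing the words inside strong and i, which is the problem in the question. The second line matches what the normalize-space expression in the answer is expected to return for this fragment.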