
How to select the inner text of a link using XPath?


I am using Scrapy to crawl data.

In my browser's JS console, I run $x('//div[@class="summary"]//div[contains(@class, "tags")]'), which gets me close to what I need, but I still need to filter the result.

The following picture shows the result of the $x('//div[@class="summary"]//div[contains(@class, "tags")]') command.

[Image: JS console result]

How should I write the XPath expression to get the data in the green box? I tried $x('//div[@class="summary"]//div[contains(@class, "tags")]//a[contains(@class, "post-tag")]'), but that is not what I want.

Thank you!


Solution

  • Your attempt selects the <a> elements themselves rather than their text. To select the inner text of each <a> element within the selected div, append /a/text() to the XPath that selects the div:

    //div[@class="summary"]//div[contains(@class, "tags")]/a/text()