python, scrapy, scrapy-shell

How to extract a section attribute via XPath from the page source in Scrapy?


I am trying to extract a value from a section element in the source code of a site.

The relevant part of the page source looks like this:

if ('function' === typeof window.ToggleFilters) {
    window.ToggleFilters();
}
</script>

<main id="main" data-danger="">

<section data-creation-date="2018-10-15 11:35:06">

    <div class="detail__content">

I have tried both response.css and response.xpath in the scrapy shell to get the data out of the source code, with no luck. For example:

response.xpath("//*[contains('data-creation')]")

I would like to extract just the value of data-creation-date, so that the result looks like:

'2018-10-15 11:35:06'

Solution

  • response.css('#main section::attr("data-creation-date")').extract_first()

    or

    response.xpath("//@data-creation-date").extract_first()

    or

    response.xpath("//main/section/@data-creation-date").extract_first()