Extracting keywords from a meta tag using Scrapy


I'm trying to use Scrapy to download some content for a school project. I would like to get a list of keywords for each page that I can then store in a database. This is what I've got so far:

scrapy shell http://news.nationalgeographic.com/2015/03/150318-pitcairn-marine-reserve-protected-area-ocean-conservation/

>>> response.xpath('//title/text()').extract()

[u'World\u2019s Largest Single Marine Reserve Created in Pacific']

>>> response.xpath("//meta[@name='keywords']")[0].extract()

u'<meta name="keywords" content="ocean life, conservationists, marine biodiversity, marine sanctuaries, wildlife conservation, marine protected areas, mpas, reserves, sanctuaries, ocean conservation">'

What I'd like to do is extract just the value of the content attribute from the meta tag where name='keywords'.

Thanks!


Solution

  • Simply append /@content to the XPath expression to select the content attribute instead of the whole element:

    response.xpath("//meta[@name='keywords']/@content")[0].extract()
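
    The expression above returns the keywords as one comma-separated string, while the question asks for a list that can be stored in a database. A minimal sketch of that last step follows; split_keywords is a hypothetical helper name (not part of Scrapy), and the sample string is copied from the meta tag shown in the question.

```python
def split_keywords(content):
    """Split a comma-separated keywords string into a clean list.

    `split_keywords` is a hypothetical helper, not a Scrapy API.
    """
    # A page without a keywords meta tag may yield None or an empty string.
    if not content:
        return []
    # Strip whitespace around each keyword and drop empty entries.
    return [kw.strip() for kw in content.split(",") if kw.strip()]


# Sample value copied from the meta tag in the question.
content = (
    "ocean life, conservationists, marine biodiversity, "
    "marine sanctuaries, wildlife conservation, marine protected areas, "
    "mpas, reserves, sanctuaries, ocean conservation"
)
keywords = split_keywords(content)
print(keywords)
```

    In the shell, the input string could come from response.xpath("//meta[@name='keywords']/@content").extract_first() in Scrapy versions that provide extract_first(); it returns None when the tag is missing, whereas indexing with [0] raises an IndexError.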