
How do you count specific tags within a parent tag using Scrapy?


On a site that I am trying to scrape, each page has 6 tables, and within each table an image tag is repeated between 1 and 5 times. I want to count how many times the image tag appears in each table.

The tables are identified by their @data-trap attribute: @data-trap = '1', @data-trap = '2', etc.

Below is an example of code I have unsuccessfully tried:

for products in response.xpath('*//tbody//*'):
    if products.xpath('tbody [@data-trap = '1']/../@src').get() == '/greyhound-racing/img/icon/star-blue.png':
        s += 1

The error message in the Scrapy shell is: SyntaxError: invalid syntax. Perhaps you forgot a comma?

s should end up with a value between 1 and 5, depending on the table. Where am I going wrong?


Solution

  • The culprit is the apostrophes: the same character both delimits your Python string and quotes the value inside the XPath. See here:

    if products.xpath('tbody [@data-trap = '1']/../@src').get()

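    Because the string is delimited by ' and also contains ', Python ends the string at the apostrophe just before 1. The 1 is left outside the string, so the line no longer parses. There are two alternatives: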

    1. products.xpath("tbody [@data-trap = '1']/../@src").get() # Double quotes around the string

    2. products.xpath('tbody [@data-trap = \'1\']/../@src').get() # Escaped apostrophes
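
The quoting problem can be checked without Scrapy, since it is purely a matter of Python string literals. The sketch below is only illustrative: star_xpath is a hypothetical helper that is not part of Scrapy, and the expression it builds assumes the img tags are descendants of the tbody carrying @data-trap, which the question does not confirm.

```python
# Taken from the question: the icon path and the data-trap attribute on <tbody>.
STAR_SRC = "/greyhound-racing/img/icon/star-blue.png"


def star_xpath(trap):
    """Hypothetical helper: XPath for the star images inside one table.

    Assumes the <img> tags are descendants of <tbody data-trap="...">.
    """
    # Double quotes delimit the Python string, so the single quotes
    # that XPath needs can appear inside it unescaped.
    return f"//tbody[@data-trap='{trap}']//img[@src='{STAR_SRC}']"


# Both fixes from the answer produce exactly the same Python string.
double_quoted = "tbody [@data-trap = '1']/../@src"
escaped = 'tbody [@data-trap = \'1\']/../@src'
print(double_quoted == escaped)

# The original line fails before Scrapy ever sees it: Python reads
# 'tbody [@data-trap = ' as a complete string, then meets a bare 1.
broken_source = "expr = 'tbody [@data-trap = '1']/../@src'"
try:
    compile(broken_source, "<example>", "exec")
    broken_parses = True
except SyntaxError:
    broken_parses = False
print(broken_parses)
```

If the markup matches that assumption, len(response.xpath(star_xpath(1))) in the Scrapy shell should give the count for the first table, and looping trap over 1 to 6 would cover all six tables.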