python web-scraping scrapy pyspider

Can't get all info from a page with scrapy


I am making a spider to get weather data from weather.com. I have written a for loop to iterate over a list of <a> elements containing the data I want, and within the loop I extract the values. But the loop only runs once.

Why is this happening? Why is my loop not iterating through all the values I want it to?

class WeatherSpider(scrapy.Spider):
    name = "weather"
    allowed_domains = ["weather.com"]
    start_urls = ["https://weather.com/weather/tenday/l/Homewood+AL?canonicalCityId=ee632098bb6c46fd10d48efa5cf1550a9e8a2d593da04926653f01e690d40ba2"]

    def parse(self, response):
        weather_item = WeatherApiItem()
        x = response.css('div.DailyForecast--DisclosureList--nosQS')
        
        for i in x:
            print(1)
            weather_item['Hight_temp'] = i.css('span.DetailsSummary--highTempValue--3PjlX ::text').get()
            weather_item['Low_temp'] = i.css('span.DetailsSummary--lowTempValue--2tesQ ::text').get()
            weather_item['Rain_chance'] = i.xpath('//*[@id="detailIndex1"]/summary/div/div/div[3]/span/text()').get()
        yield weather_item

Solution

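  • The selector div.DailyForecast--DisclosureList--nosQS appears to match only the single container element, so your loop body runs once. In addition, the yield sits outside the loop, so at most one item is ever emitted, and the rain-chance XPath starts with //, which searches the whole document rather than the current element. The updated version below iterates over the details elements inside the container, creates a new item for each one, and yields it inside the loop.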

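    import scrapy

    # WeatherApiItem is assumed to be importable from your project's items module.


    class WeatherSpider(scrapy.Spider):
        name = "weather"
        allowed_domains = ["weather.com"]
        start_urls = ["https://weather.com/weather/tenday/l/Homewood+AL?canonicalCityId=ee632098bb6c46fd10d48efa5cf1550a9e8a2d593da04926653f01e690d40ba2"]

        def parse(self, response):
            # Iterate over each <details> row, not the single container div.
            for result_sel in response.css('.DailyForecast--DisclosureList--nosQS details'):
                # Create a fresh item per row so earlier values are not overwritten.
                weather_item = WeatherApiItem()
                # Substring matches avoid depending on the generated class-name suffixes.
                weather_item['hight_temp'] = result_sel.css('[class*="DetailsSummary--highTempValue"] ::text').get()
                weather_item['low_temp'] = result_sel.css('[class*="DetailsSummary--lowTempValue"] ::text').get()
                weather_item['rain_chance'] = result_sel.css('[data-testid="PercentageValue"]::text').get()

                # Yield inside the loop so every row produces its own item.
                yield weather_item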
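
To see why the placement of yield matters independently of the selectors, here is a minimal sketch in plain Python with no Scrapy involved. The function names parse_buggy and parse_fixed and the sample rows are hypothetical stand-ins for the spider's parse method and the forecast rows. They are not part of the original code.

```python
def parse_buggy(rows):
    """Mimics the question: one shared item, yield placed after the loop."""
    item = {}
    for row in rows:
        # Each pass overwrites the values stored by the previous pass.
        item["high_temp"] = row["high"]
        item["low_temp"] = row["low"]
    yield item  # reached once, after the loop has finished


def parse_fixed(rows):
    """Mimics the answer: a fresh item per row, yield placed inside the loop."""
    for row in rows:
        item = {"high_temp": row["high"], "low_temp": row["low"]}
        yield item  # reached once per row


# Hypothetical stand-in data for three forecast rows.
rows = [
    {"high": "71", "low": "50"},
    {"high": "68", "low": "47"},
    {"high": "75", "low": "55"},
]

buggy_items = list(parse_buggy(rows))
fixed_items = list(parse_fixed(rows))

print(len(buggy_items))  # 1
print(len(fixed_items))  # 3
```

Even if the original selector had matched every row, the buggy layout would still have emitted a single item holding only the last row's values.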