json web-scraping scrapy screen-scraping scraper

Scraping JSON Data from a REST API


I am learning Firebase with Android and need a database to play with. The JSON request URL is https://yts.ag/api/v2/list_movies.json. It contains a list of around 5000 movies that I need. I searched around and found a tool called Scrapy, but I have no idea how to use it with a REST API. Any help is appreciated.


Solution

  • First, follow the Scrapy tutorial to create a Scrapy project. After that, your spider can be as simple as this:

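    import json

    from scrapy import Request, Spider


    class MySpider(Spider):
        name = 'myspider'

        # The API endpoint is requested first; its JSON body is passed to parse().
        start_urls = ['https://yts.ag/api/v2/list_movies.json']

        def parse(self, response):
            # Decode the JSON body and follow the page URL of each movie.
            json_response = json.loads(response.text)
            for movie in json_response['data']['movies']:
                yield Request(movie['url'], callback=self.parse_movie)

        def parse_movie(self, response):
            # work with every movie response
            yield {'url': response.url}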
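
A single request to the endpoint will probably not return all of the roughly 5000 movies, so the spider would need to walk through the result pages. The sketch below uses only the standard library and shows one way to do that. It assumes the API accepts `limit` and `page` query parameters and reports a total in `data.movie_count`. None of that is stated in the question, so check it against the API documentation first. The helper names `extract_movies` and `page_urls` are made up for illustration.

```python
import json
import math
from urllib.parse import urlencode

# Endpoint taken from the question.
API_URL = 'https://yts.ag/api/v2/list_movies.json'


def extract_movies(body):
    """Return the list of movie dicts from a list_movies JSON body."""
    payload = json.loads(body)
    # 'data' -> 'movies' is the same path the spider above walks.
    return payload.get('data', {}).get('movies', [])


def page_urls(movie_count, limit=50):
    """Build one URL per result page.

    Assumption: the API understands 'limit' and 'page' query parameters.
    """
    pages = math.ceil(movie_count / limit)
    return [API_URL + '?' + urlencode({'limit': limit, 'page': page})
            for page in range(1, pages + 1)]
```

Inside `parse`, the spider could read the total from the first response and yield one `Request(url, callback=...)` per entry of `page_urls(total)`, using a callback that extracts the movies from each page. If the spider lives in a single file, `scrapy runspider myspider.py -o movies.json` writes the yielded items to a JSON file.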