Scraping from IMDB with Scrapy


I'm scraping data from the IMDB Top 250 Movies page, specifically trying to retrieve user voting information.

The HTML structure for user voting is like this:

" ("
"3.8M"
")"

So, when I extract this information with the CSS selector response.css('.ipc-rating-star--voteCount::text').getall(), I get a list in which the opening parenthesis, the vote count, and the closing parenthesis are separate items.

DataFrame with IMDB data retrieved

I want to extract only the numeric part (the user vote count) from this structure, without the parentheses.

This is my whole code:

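import random
import re
import time

import scrapy

# Top 250 chart URL (assumed; the original snippet left `url` undefined)
url = 'https://www.imdb.com/chart/top/'

movie_data = []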

class IMDBSpider(scrapy.Spider):
    name = 'imdb_spider'

    def start_requests(self):
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
        yield scrapy.Request(url, headers=headers, callback=self.parse)
            
    def parse(self, response):
        # movie name
        movie_name = response.css('ul.ipc-metadata-list h3.ipc-title__text::text').getall()

        # movie year
        movie_year = response.css('.sc-c7e5f54-8.hgjcbi.cli-title-metadata-item:first-child::text').getall()

        # movie ratings
        movie_rating = response.css('ul.ipc-metadata-list span.ipc-rating-star--base::text').getall()

        # user votings
        user_vote = response.css('.ipc-rating-star--voteCount::text').getall()

        print(response.css('.ipc-rating-star--voteCount::text').getall())

        for name, year, rating, votes in zip(movie_name, movie_year, movie_rating, user_vote):
            self.log(f'Processing: {name}, {year}, {rating}, {votes}')
            relevant_elements = self.extract_combined_str(votes)
            self.log(f'Relevant Elements: {relevant_elements}')
            movie_dict = {
            'movie_name': self.extract_name(name),  # Call the extract_name method using self
            'movie_year': year.strip(),
            'movie_rating': rating.strip(),
            'user_votes': [vote for vote in votes]
            }
            movie_data.append(movie_dict)


        delay = random.uniform(2, 5)
        self.log(f'Delaying for {delay} seconds.')
        time.sleep(delay)

    def extract_name(self, name):
        name = name.strip()
        name = re.sub(r'^\d+\.\s*', '', name)
        return name
    
    def extract_combined_str(self, votes):
        # Keep only the elements that contain digits and parentheses
        numeric_value = re.search(r'\(([\d.]+[MK]?)\)', votes)
        return numeric_value.group(1) if numeric_value else None

As mentioned, I've tried to extract the user votes and remove the parentheses, but it isn't working, as can be seen in the image.


Solution

  • You can call .getall() on each individual rating span element, take the middle item of the resulting list, and call .strip('()"') on it to remove any leftover characters.

    This is much easier to do when you iterate through each of the movie sections row by row instead of collecting them all at once.

    For example:

    import random
    import re
    import time

    import scrapy

    movie_data = []

    class IMDBSpider(scrapy.Spider):
        name = 'imdb_spider'
    
        def start_requests(self):
            headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}
            yield scrapy.Request("https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK222QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3", headers=headers, callback=self.parse)
    
        def parse(self, response):
            # iterate movie sections
            for movie in response.css(".cli-parent"):
                # movie name
                movie_name = movie.css('h3::text').get()
                # movie year
                movie_year = movie.css('.cli-title-metadata-item:first-child::text').get()
                # movie ratings
                movie_rating = movie.css('.ipc-rating-star--base::text').get()
                # user votings
                user_vote = movie.xpath('.//span[@class="ipc-rating-star--voteCount"]//text()').getall()
                user_vote = user_vote[1].strip('"()')
                self.log(f'Processing: {movie_name}, {movie_year}, {movie_rating}, {user_vote}')
                movie_dict = {
                    'movie_name': self.extract_name(movie_name),
                    'movie_year': movie_year.strip(),
                    'movie_rating': movie_rating,
                    'user_votes': user_vote
                }
                self.log(f'Relevant Elements: {movie_dict}')
                movie_data.append(movie_dict)
                # yield the item so Scrapy logs it as "Scraped from ..." (see the output below)
                yield movie_dict
            delay = random.uniform(2, 5)
            self.log(f'Delaying for {delay} seconds.')
            time.sleep(delay)
    
        def extract_name(self, name):
            name = name.strip()
            name = re.sub(r'^\d+\.\s*', '', name)
            return name
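
To see why the page-level getall() in the question misbehaves, and why handling each row on its own avoids the problem, here is a minimal sketch that needs no network access. The flat_nodes list is a hypothetical stand-in for what response.css('.ipc-rating-star--voteCount::text').getall() returns (three text nodes per movie, as described in the question), the movie names are placeholders, and pick_vote is an illustrative helper rather than anything Scrapy provides.

```python
import re

# Hypothetical stand-in for the page-level getall() result:
# every movie contributes three text nodes, as described in the question.
flat_nodes = [" (", "3.8M", ")", " (", "692K", ")"]
names = ["Movie A", "Movie B"]  # placeholder names

# zip() pairs each name with ONE text node, so rows drift out of alignment.
misaligned = list(zip(names, flat_nodes))

# A vote count looks like digits, an optional decimal part, and an optional K/M.
VOTE_RE = re.compile(r"\d+(?:\.\d+)?[KM]?")


def pick_vote(text_nodes):
    """Return the first text node that looks like a vote count, else None."""
    for node in text_nodes:
        cleaned = node.strip().strip('()"')
        if VOTE_RE.fullmatch(cleaned):
            return cleaned
    return None


# Per-row handling: each movie's own three nodes are processed together.
rows = [flat_nodes[i:i + 3] for i in range(0, len(flat_nodes), 3)]
votes = [pick_vote(row) for row in rows]

print(misaligned)
print(votes)
```

Inside the spider, a helper like this could be applied to the per-movie getall() result in place of user_vote[1], which raises IndexError when a row yields fewer than two text nodes.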
    

    OUTPUT

    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': "Pan's Labyrinth", 'movie_year': '2006', 'movie_rating': '8.2', 'user_votes': '692K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Jurassic Park', 'movie_year': '1993', 'movie_rating': '8.2', 'user_votes': '1M'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'The Sixth Sense', 'movie_year': '1999', 'movie_rating': '8.2', 'user_votes': '1M'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Unforgiven', 'movie_year': '1992', 'movie_rating': '8.2', 'user_votes': '428K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'A Beautiful Mind', 'movie_year': '2001', 'movie_rating': '8.2', 'user_votes': '968K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'The Treasure of the Sierra Madre', 'movie_year': '1948', 'movie_rating': '8.2', 'user_votes': '130K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'No Country for Old Men', 'movie_year': '2007', 'movie_rating': '8.2', 'user_votes': '1M'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Yojimbo', 'movie_year': '1961', 'movie_rating': '8.2', 'user_votes': '129K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Kill Bill: Vol. 1', 'movie_year': '2003', 'movie_rating': '8.2', 'user_votes': '1.2M'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'The Thing', 'movie_year': '1982', 'movie_rating': '8.2', 'user_votes': '453K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Monty Python and the Holy Grail', 'movie_year': '1975', 'movie_rating': '8.2', 'user_votes': '561K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'The Great Escape', 'movie_year': '1963', 'movie_rating': '8.2', 'user_votes': '254K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Finding Nemo', 'movie_year': '2003', 'movie_rating': '8.2', 'user_votes': '1.1M'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Rashomon', 'movie_year': '1950', 'movie_rating': '8.2', 'user_votes': '177K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'The Elephant Man', 'movie_year': '1980', 'movie_rating': '8.2', 'user_votes': '253K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Chinatown', 'movie_year': '1974', 'movie_rating': '8.2', 'user_votes': '342K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': "Howl's Moving Castle", 'movie_year': '2004', 'movie_rating': '8.2', 'user_votes': '429K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Dial M for Murder', 'movie_year': '1954', 'movie_rating': '8.2', 'user_votes': '185K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Gone with the Wind', 'movie_year': '1939', 'movie_rating': '8.2', 'user_votes': '328K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'V for Vendetta', 'movie_year': '2005', 'movie_rating': '8.2', 'user_votes': '1.2M'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Raging Bull', 'movie_year': '1980', 'movie_rating': '8.1', 'user_votes': '372K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Prisoners', 'movie_year': '2013', 'movie_rating': '8.1', 'user_votes': '778K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Lock, Stock and Two Smoking Barrels', 'movie_year': '1998', 'movie_rating': '8.1', 'user_votes': '605K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'The Secret in Their Eyes', 'movie_year': '2009', 'movie_rating': '8.2', 'user_votes': '218K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Inside Out', 'movie_year': '2015', 'movie_rating': '8.1', 'user_votes': '760K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Spider-Man: No Way Home', 'movie_year': '2021', 'movie_rating': '8.2', 'user_votes': '842K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Three Billboards Outside Ebbing, Missouri', 'movie_year': '2017', 'movie_rating': '8.1', 'user_votes': '540K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'The Bridge on the River Kwai', 'movie_year': '1957', 'movie_rating': '8.1', 'user_votes': '230K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Trainspotting', 'movie_year': '1996', 'movie_rating': '8.1', 'user_votes': '713K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Fargo', 'movie_year': '1996', 'movie_rating': '8.1', 'user_votes': '707K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Warrior', 'movie_year': '2011', 'movie_rating': '8.1', 'user_votes': '490K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Gran Torino', 'movie_year': '2008', 'movie_rating': '8.1', 'user_votes': '802K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Catch Me If You Can', 'movie_year': '2002', 'movie_rating': '8.1', 'user_votes': '1.1M'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'My Neighbor Totoro', 'movie_year': '1988', 'movie_rating': '8.1', 'user_votes': '365K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Million Dollar Baby', 'movie_year': '2004', 'movie_rating': '8.1', 'user_votes': '710K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Klaus', 'movie_year': '2019', 'movie_rating': '8.2', 'user_votes': '174K'}
    2023-11-10 23:47:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.imdb.com/chart/top/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=470df400-70d9-4f35-bb05-8646a1195842&pf_rd_r=5V6VAGPEK22
    2QB9E0SZ8&pf_rd_s=right-4&pf_rd_t=15506&pf_rd_i=toptv&ref_=chttvtp_ql_3>
    {'movie_name': 'Harry Potter and the Deathly Hallows: Part 2', 'movie_year': '2011', 'movie_rating': '8.1', 'user_votes': '924K'}
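
If the vote counts are needed as numbers rather than strings such as '692K' or '1.2M', a small conversion step can follow the extraction. This is only a sketch: votes_to_int and SUFFIXES are hypothetical names, and it assumes the only suffixes that occur are K (thousands) and M (millions), which matches the output above. Because the page shows abbreviated counts, the results are rounded approximations.

```python
from decimal import Decimal

# Assumed suffixes, based on the values seen in the output above.
SUFFIXES = {"K": 1_000, "M": 1_000_000}


def votes_to_int(text):
    """Convert an abbreviated count such as '692K' or '1.2M' to an integer."""
    text = text.strip()
    multiplier = SUFFIXES.get(text[-1:].upper(), 1)
    number = text[:-1] if multiplier != 1 else text
    # Decimal avoids float rounding surprises for values like '1.1M'.
    return int(Decimal(number) * multiplier)


for sample in ("692K", "1.2M", "1M", "850"):
    print(sample, "->", votes_to_int(sample))
```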