python, web-scraping, beautifulsoup, python-requests, imdb

Download only the required part of a webpage on IMDB using Python


I am trying to retrieve movie cover images from IMDB: I go through a CSV file of movie names, download each movie's cover picture, and store it locally. Instead of downloading the entire webpage and then selecting the required part (the image element), is there a way to just find the GET request the browser sent to retrieve the image?

I was able to get the URL, but there does not seem to be a pattern that would let me loop over the movies and download the images one after another.

This is the GET request for Toy Story 1:

[screenshot not preserved]

This is the GET request for Toy Story 3:

[screenshot not preserved]

I was able to remove all the characters after "@" and still get the image, since they are only the sizing options for the image.
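To make that trimming step concrete, here is a minimal sketch. It assumes the sizing options always come after the last "@" in the URL; `strip_size_suffix` is a hypothetical helper name, and whether the server still serves the trimmed URL is only what I observed for the two examples above.

```python
def strip_size_suffix(url: str) -> str:
    """Drop everything after the last '@' in the URL, if there is one.

    Assumption: the part after the last '@' only carries sizing options.
    """
    head, sep, _tail = url.rpartition("@")
    # rpartition returns ('', '', url) when no '@' is present,
    # so URLs without a size suffix are returned unchanged.
    return head + sep if sep else url
```

Calling it on a URL that contains no "@" returns the URL as it was passed in.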


Solution

  • You can use the OMDb API. Querying it returns a JSON response with a lot of information about the movie, including a link to its cover image. For example, searching for Toy Story:

    {"Title":"Toy Story",
    "Year":"1995",
    ...
    "Poster":"https://images-na.ssl-images-amazon.com/images/M/MV5BMDU2ZWJlMjktMTRhMy00ZTA5LWEzNDgtYmNmZTEwZTViZWJkXkEyXkFqcGdeQXVyNDQ2OTk4MzI@._V1_SX300.jpg",
    ...
    "Response":"True"}

    I've used it for my Movie Indexer, albeit in Java, if you wanna check out how it works.
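
    Since the question is about Python, here is a rough sketch of the same idea using only the standard library. It is not the code from my indexer. It assumes you have an OMDb API key, that the movie names sit in the first column of the CSV, and that the posters are JPEGs; the function names, the `posters` output folder, and the file naming scheme are all made up for illustration.

```python
import csv
import json
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Optional

OMDB_ENDPOINT = "https://www.omdbapi.com/"


def build_query_url(title: str, api_key: str) -> str:
    """Build an OMDb lookup URL for a title (the `t` query parameter)."""
    query = urllib.parse.urlencode({"t": title, "apikey": api_key})
    return OMDB_ENDPOINT + "?" + query


def extract_poster_url(payload: dict) -> Optional[str]:
    """Return the poster link from a decoded OMDb response, or None."""
    if payload.get("Response") != "True":
        return None
    poster = payload.get("Poster")
    # OMDb uses the string "N/A" when it has no poster for a title.
    return poster if poster and poster != "N/A" else None


def fetch_poster(title: str, api_key: str, out_dir: Path) -> Optional[Path]:
    """Look up one title and save its poster; return the saved path or None."""
    with urllib.request.urlopen(build_query_url(title, api_key), timeout=10) as resp:
        payload = json.load(resp)
    poster_url = extract_poster_url(payload)
    if poster_url is None:
        return None
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: keep letters and digits, replace the rest.
    safe_name = "".join(c if c.isalnum() else "_" for c in title)
    target = out_dir / f"{safe_name}.jpg"  # assumes the poster is a JPEG
    with urllib.request.urlopen(poster_url, timeout=10) as img:
        target.write_bytes(img.read())
    return target


def fetch_all(csv_path: str, api_key: str, out_dir: str = "posters") -> None:
    """Download a poster for every title in the first column of the CSV."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for row in csv.reader(fh):
            if row:
                fetch_poster(row[0], api_key, Path(out_dir))
```

    With a file such as `movies.csv`, calling `fetch_all("movies.csv", "YOUR_API_KEY")` would go through the names one by one. Titles that OMDb cannot match, or that have no poster, are simply skipped, so you never have to guess the image URL pattern yourself.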