imdbpy - cannot get the episode ids


I'm a beginner at Python.

I am trying to get some information about a movie using IMDbPY. The code looks like this:

from imdb import IMDb

ia = IMDb()
results = ia.search_movie(movie_name)  # movie_name holds the title to search for
movie_id = results[0].getID()  # ID of the first search result
movie = ia.get_movie(movie_id)

If movie_name is the name of a series, I then try to get the episode IDs:

episode_ids = ia.get_movie_episodes(movie)

But I get an error that I do not know how to fix:

Traceback (most recent call last):
  File "e:\python\file_name\film_names.py", line 333, in <module>
    writing_imdb_S(film_name, path,file_oldname_need, prefix_and_sufix)
  File "e:\python\file_name\film_names.py", line 146, in writing_imdb_S
    episode_ids = ia.get_movie_episodes(movie)
  File "C:\Users\mahdi\AppData\Roaming\Python\Python310-32\site-packages\imdb\parser\http\__init__.py", line 631, in get_movie_episodes
    cont = self._retrieve(self.urls['movie_main'] % movieID + 'episodes')
  File "C:\Users\mahdi\AppData\Roaming\Python\Python310-32\site-packages\imdb\parser\http\__init__.py", line 392, in _retrieve
    ret = self.urlOpener.retrieve_unicode(url, size=size)
  File "C:\Users\mahdi\AppData\Roaming\Python\Python310-32\site-packages\imdb\parser\http\__init__.py", line 233, in retrieve_unicode
    response = uopener.open(url)
  File "C:\Program Files (x86)\Python310-32\lib\urllib\request.py", line 519, in open
    response = self._open(req, data)
  File "C:\Program Files (x86)\Python310-32\lib\urllib\request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +    
  File "C:\Program Files (x86)\Python310-32\lib\urllib\request.py", line 496, in _call_chain
    result = func(*args)
  File "C:\Program Files (x86)\Python310-32\lib\urllib\request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "C:\Program Files (x86)\Python310-32\lib\urllib\request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,        
  File "C:\Program Files (x86)\Python310-32\lib\http\client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)      
  File "C:\Program Files (x86)\Python310-32\lib\http\client.py", line 1293, in _send_request
    self.putrequest(method, url, **skips)
  File "C:\Program Files (x86)\Python310-32\lib\http\client.py", line 1127, in putrequest
    self._validate_path(url)
  File "C:\Program Files (x86)\Python310-32\lib\http\client.py", line 1227, in _validate_path
    raise InvalidURL(f"URL can't contain control characters. {url!r} "  
http.client.InvalidURL: URL can't contain control characters. '/title/ttThe Last of Us/episodes' (found at least ' ')

(I pasted the whole traceback because I don't know which part is important.)

I have not tried anything to fix it, because I don't really know much about URLs, requests, or IMDbPY itself. I would appreciate any help with this.

I don't know if you need this, but I'm trying to get the following information: ['title', 'year', 'rating', 'season', 'episode_number', 'episode_title', 'episode_rating', 'genres']


Solution

  • I think the correct way to get the episodes, according to the documentation, is:

    The episodes of a series can be fetched using the “episodes” infoset. This infoset adds an episodes key which is a dictionary from season numbers to episodes. And each season is a dictionary from episode numbers within the season to the episodes. Note that the season and episode numbers don’t start from 0; they are the numbers given by the IMDb:

    >>> ia.update(series, 'episodes')
    >>> sorted(series['episodes'].keys()) # the keys here are actually season number!
    [1, 2, 3, 4]
    >>> season4 = series['episodes'][4]
    >>> len(season4)
    13
    >>> episode = series['episodes'][4][2]
    >>> episode
    <Movie id:1038701[http] title:_"The 4400" Fear Itself (2007)_>
    >>> episode['season']
    4
    >>> episode['episode']
    2

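    Note that the series object in this example plays the same role as your movie object, so you can call ia.update(movie, 'episodes') and then read movie['episodes']. Judging by the URL in your traceback ('/title/ttThe Last of Us/episodes'), the error occurs because get_movie_episodes was given the Movie object, and its title ended up in the URL where the ID is expected. You can find more information in the documentation: https://cinemagoer.readthedocs.io/en/latest/usage/series.html#series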
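
Applied to the code from the question, the "episodes" infoset could be used roughly as in the sketch below. The helper names collect_episode_rows and fetch_series_rows are hypothetical and not part of the library. The sketch also assumes each episode may expose 'title' and 'rating' keys; since either can be missing, it reads them with .get, which returns None in that case.

```python
def collect_episode_rows(series):
    """Flatten series['episodes'] (season -> episode number -> episode) into a list of dicts."""
    rows = []
    for season, season_episodes in series['episodes'].items():
        for number, episode in season_episodes.items():
            rows.append({
                'season': season,
                'episode_number': number,
                # .get returns None when the key was not provided for this episode
                'episode_title': episode.get('title'),
                'episode_rating': episode.get('rating'),
            })
    return rows


def fetch_series_rows(movie_name):
    """Search for a series by name and return one row per episode (needs network access)."""
    from imdb import IMDb  # imported here so the helper above works without the package

    ia = IMDb()
    results = ia.search_movie(movie_name)
    if not results:
        return []
    series = ia.get_movie(results[0].getID())
    ia.update(series, 'episodes')  # adds the 'episodes' key to the same object
    return collect_episode_rows(series)
```

Calling fetch_series_rows('The Last of Us') would then return the rows for the first search result. The series-level fields ('title', 'year', 'rating', 'genres') can be read from the same series object, for example with series.get('year').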