python dictionary edge-list

Create an edge list that groups films by genre, i.e. joins two films of the same genre


I've just started using Python and I want to build an edge list that links the titles of movies that have a genre in common. I have this dictionary:

dictionary_title_withonegenere = {
    28: ['Avatar: The Way of Water', 'Violent Night', 'Puss in Boots: The Last Wish'],
    12: ['Avatar: The Way of Water', 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'],
    16: ['Puss in Boots: The Last Wish', 'Strange World']}

Here 28, 12 and 16 are the genre IDs of the movies. I want to create an edge list that groups movies by genre, i.e. joins every two movies that share a genre:

source                         target
Avatar: The Way of Water       Violent Night
Avatar: The Way of Water       Puss in Boots: The Last Wish
Violent Night                  Puss in Boots: The Last Wish
Avatar: The Way of Water       The Chronicles of Narnia: The Lion, the Witch and the Wardrobe
Puss in Boots: The Last Wish   Strange World

This is my idea:

edges=[]
genres=[28,12,16]

    for i in range(0,len(genres)):
            for genres[i] in dictionary_title_withonegenere[genres[i]]:
                for genres[i] in dictionary_title_withonegenere[genres[i]][1:]:
                    edges.append({"sorce":dictionary_title_withonegenere[genres[i]][0],"target":dictionary_title_withonegenere[genres[i]][y]})

    print((edges))

My code doesn't work. How can I do this?


Solution

  • You can check whether two movies share a genre by first building an intermediate data structure: a mapping from each movie to the set of its genres. With that mapping, you can iterate over every pair of movies, check whether their genre sets have any genre in common, and create an edge between them if they do.

    from pprint import pprint

    dictionary_title_withonegenere= {28: ['Avatar: The Way of Water', 'Violent Night', 'Puss in Boots: The Last Wish'],
    12: ['Avatar: The Way of Water', 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'],
    16: ['Puss in Boots: The Last Wish', 'Strange World']}
    
    # Invert the mapping: movie title -> set of genre ids
    movies_with_genre = {}
    movies_set = set()
    for genre, movies in dictionary_title_withonegenere.items():
        for movie in movies:
            movies_with_genre.setdefault(movie, set()).add(genre)
            movies_set.add(movie)
        
    pprint(movies_with_genre)
    movie_list = list(movies_set)
    edges = []
    # Compare each unordered pair of movies exactly once (j always starts after i)
    for i in range(len(movie_list)):
        source_movie= movie_list[i]
        for j in range(i + 1, len(movie_list)):
            target_movie = movie_list[j]
            # Stop at the first genre the two movies share
            common_genre = False
            for source_genre in movies_with_genre[source_movie]:
                if source_genre in movies_with_genre[target_movie]:
                    common_genre = True
                    break
            if common_genre:
                # The "sorce" key is spelled as in the question's code
                edges.append({"sorce":source_movie, "target":target_movie})
    pprint(edges)
    

    OUTPUT (the order of the edges can differ between runs, because it depends on set iteration order)

    {'Avatar: The Way of Water': {28, 12},
     'Puss in Boots: The Last Wish': {16, 28},
     'Strange World': {16},
     'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe': {12},
     'Violent Night': {28}}
    [{'sorce': 'Strange World', 'target': 'Puss in Boots: The Last Wish'},
     {'sorce': 'Avatar: The Way of Water',
      'target': 'Puss in Boots: The Last Wish'},
     {'sorce': 'Avatar: The Way of Water', 'target': 'Violent Night'},
     {'sorce': 'Avatar: The Way of Water',
      'target': 'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'},
     {'sorce': 'Puss in Boots: The Last Wish', 'target': 'Violent Night'}]
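
As a possible alternative to the pairwise comparison above, the edges can also be generated genre by genre with `itertools.combinations`, which yields every unordered pair of titles within one genre. The following is only a minimal sketch under a few assumptions: the names `genre_to_titles` and `build_edges` are hypothetical, the key is spelled `"source"` rather than `"sorce"`, and a pair of movies that shares several genres is assumed to need only one edge.

```python
from itertools import combinations

# Same data as in the question, under a shorter (hypothetical) name
genre_to_titles = {
    28: ['Avatar: The Way of Water', 'Violent Night', 'Puss in Boots: The Last Wish'],
    12: ['Avatar: The Way of Water',
         'The Chronicles of Narnia: The Lion, the Witch and the Wardrobe'],
    16: ['Puss in Boots: The Last Wish', 'Strange World'],
}


def build_edges(genre_to_titles):
    """Return one edge per pair of distinct titles that share at least one genre."""
    seen = set()   # unordered pairs that already have an edge
    edges = []
    for titles in genre_to_titles.values():
        # combinations(titles, 2) yields each pair of list positions once
        for source, target in combinations(titles, 2):
            if source == target:
                continue  # the same title listed twice would give a self-loop
            pair = frozenset((source, target))
            if pair in seen:
                continue  # the pair was already linked through another genre
            seen.add(pair)
            edges.append({"source": source, "target": target})
    return edges


edges = build_edges(genre_to_titles)
for edge in edges:
    print(edge["source"], "->", edge["target"])
```

Because dictionaries and lists keep their insertion order, the edges come out in the order of the genres and titles in the input, so the result does not change between runs. The work done is proportional to the number of pairs inside each genre, rather than to the number of pairs over all movies.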