python pandas nlp nltk

How to display the value in the same row that matched the input?


The data has two columns, title and genre. I am trying to return the title of each row whose genre matches the user input.

Here is what I tried:

import pandas as pd
from nltk.tokenize import word_tokenize

#CSV READ & GENRE-TITLE
data = pd.read_csv("data.csv")
df_title = data["title"]
df_genre = data["genre"]

#TOKENIZE
tokenized_genre = [word_tokenize(i) for i in df_genre]
tokenized_title = [word_tokenize(i) for i in df_title]

#INPUT-DATA MATCH
search = {e.lower() for l in tokenized_genre  for e in l}
choice = input('Please enter a word = ')

while choice != "exit":
    if choice.lower() in search:
        print(data.loc[data.genre == {choice}, 'title'])
    else:
        print("The movie of the genre doesn't exist")
    choice = input("Please enter a word = ")

But the result is: Series([], Name: title, dtype: object)

How can I solve it?

Edit: Data samples for title

0                              The Story of the Kelly Gang
1                                           Den sorte drøm
2                                                Cleopatra
3                                                L'Inferno
4        From the Manger to the Cross; or, Jesus of 
...

And for genres:

0          Biography, Crime, Drama
1                            Drama
2                   Drama, History
3        Adventure, Drama, Fantasy
4                 Biography, Drama
...

Solution

  • A proposal based only on pandas

    I would suggest something like this (please adapt it to your situation as you see fit; these are only general guidelines and hints to start from):

    import pandas as pd
    
    # Warning: some film titles contain commas and semicolons,
    # so I had to use another separator when exporting the data to CSV.
    # Here I chose the vertical bar '|', as you can see.
    
    # CSV READ & GENRE-TITLE
    data = pd.read_csv("data.csv", sep="|")
    
    choice = input('Please enter a word = ')
    
    while choice != "exit":
        choice = choice.lower()
        found = False  # becomes True once at least one row matches
        for index, row in data.iterrows():
            if choice in row['genre'].lower():
                print(row['title'])
                found = True
        # Report a miss once per search, not once per non-matching row
        if not found:
            print("The movie of the genre {} doesn't exist".format(choice))
        choice = input("Please enter a word = ")
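
The substring test in the loop above can also be written without iterating over rows. Here is a minimal sketch, assuming the genre column holds plain strings as in the question's samples. The helper name titles_for_genre and the small in-memory DataFrame are only for illustration; they are not part of the original code.

```python
import pandas as pd

# Illustrative sample built from the title/genre rows shown in the question.
data = pd.DataFrame({
    "title": ["The Story of the Kelly Gang", "Den sorte drøm",
              "Cleopatra", "L'Inferno"],
    "genre": ["Biography, Crime, Drama", "Drama",
              "Drama, History", "Adventure, Drama, Fantasy"],
})


def titles_for_genre(frame, word):
    """Return the titles whose genre string contains `word`, ignoring case."""
    # regex=False treats the input as plain text; na=False skips missing genres.
    mask = frame["genre"].str.contains(word, case=False, regex=False, na=False)
    return frame.loc[mask, "title"].tolist()


print(titles_for_genre(data, "crime"))
print(titles_for_genre(data, "western"))
```

An empty list means no genre matched, so the "doesn't exist" message can be printed once per search.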
    


    Edit

    To generate a random row index:

    from random import randint
    i = randint(0, len(data) - 1)  # randint includes both endpoints
    

    Then use i as the position to look up in your DataFrame (for example, data.iloc[i]).
    I'll let you play around with this.
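
If the goal of the random index is to show one random title for the chosen genre (this is my assumption, the edit above does not say), pandas can draw the row directly with sample. A rough sketch follows; the function name random_title_for_genre, the seed parameter and the sample data are hypothetical.

```python
import pandas as pd

# Illustrative sample built from the title/genre rows shown in the question.
data = pd.DataFrame({
    "title": ["The Story of the Kelly Gang", "Den sorte drøm",
              "Cleopatra", "L'Inferno"],
    "genre": ["Biography, Crime, Drama", "Drama",
              "Drama, History", "Adventure, Drama, Fantasy"],
})


def random_title_for_genre(frame, word, seed=None):
    """Return one random title whose genre contains `word`, or None if no match."""
    mask = frame["genre"].str.contains(word, case=False, regex=False, na=False)
    matches = frame.loc[mask, "title"]
    if matches.empty:
        return None
    # sample(n=1) draws a single row; random_state makes the draw repeatable.
    return matches.sample(n=1, random_state=seed).iloc[0]


print(random_title_for_genre(data, "drama"))
```

Drawing from the filtered rows avoids the out-of-range index that a manual randint call can produce.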



    Useful links

    Does Python have a string 'contains' substring method?
    How to iterate over rows in a DataFrame in Pandas?