
"IndexError: list index out of range" When creating an automated response bot


I'm creating a chatbot that reads questions from a CSV file and checks similarity using scikit-learn and NLTK. However, I get an error if the same input is entered twice.

This is the main code that takes the user input and outputs an answer to the user:

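import random
import string

import nltk
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
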
data=pd.read_csv('FootballQA.csv')
question=data['Q'].tolist()
answer=data['A'].tolist()

lemmer = nltk.stem.WordNetLemmatizer()


#WordNet is a semantically-oriented dictionary of English included in NLTK.
def LemTokens(tokens):
    return [lemmer.lemmatize(token) for token in tokens]
remove_punct_dict = dict((ord(punct), None) for punct in string.punctuation)

def LemNormalize(text):
    return LemTokens(nltk.word_tokenize(text.lower().translate(remove_punct_dict)))



GREETING_INPUTS = ("hello", "hi", "greetings", "sup", "what's up","hey","how are you")
GREETING_RESPONSES = ["hi", "hey", "hi there", "hello", "I am glad! You are talking to me"]
def greeting(sentence):
 
    for word in sentence.split():
        if word.lower() in GREETING_INPUTS:
            return random.choice(GREETING_RESPONSES)


GI = ("how are you")
GR = ["i'm fine","good,how can i help you!"]
def greet(sentence):
 
    for word in sentence.split():
        if word.lower() in GREETING_INPUTS:
            return random.choice(GREETING_RESPONSES)

def responses(user):
    response=''
    question.append(user)
    TfidfVec = TfidfVectorizer(tokenizer=LemNormalize, stop_words='english')
    tfidf = TfidfVec.fit_transform(question)
    val = cosine_similarity(tfidf[-1], tfidf)

    id1=val.argsort()[0][-2]
    flat = val.flatten()
    flat.sort()
    req = flat[-2]

    if(req==0):
        robo_response=response+"I am sorry! I don't understand you"
        return robo_response
    else:
        response = response+answer[id1]
        question.remove(user)
        return response

command=1
while(command):
    v = input("Enter your value: ")
    if(v=="exit"):
        command=0
    else:
        print(responses(str(v)))

When the program runs, it asks the user for input. The problem occurs when the same input is entered twice: if I enter "football", the first time it correctly displays the output I want, but the second time the program stops and I get this error:

Enter your value: scored
Alan shearer holds the goal record in the premier league.

Enter your value: football
I am sorry! I don't understand you

Enter your value: football
Traceback (most recent call last):

  File "C:\Users\Chris\Desktop\chatbot_simple\run.py", line 79, in <module>
    print(responses(str(v)))

  File "C:\Users\Chris\Desktop\chatbot_simple\run.py", line 68, in responses
    response = response+answer[id1]

IndexError: list index out of range

The csv:

Q,A
Who has scored the most goals in the premier league?,Alan shearer holds the goal record in the premier league.
Who has the most appearences in the premier league?,Gareth Barry has the most appearences in premier league history.

I've tried deleting the variable after each input, but it still somehow remembers it. Does anyone have any ideas? Thanks, Chris


Solution

  • answer=data['A'].tolist()
    

    and then later on

    id1=val.argsort()[0][-2]
    response = response+answer[id1]
    

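    So if id1 is not a valid index into answer, you get "list index out of range". In your case id1 >= len(answer). question.append(user) makes question one element longer than answer, and when req == 0 the function returns before question.remove(user) runs, so the input stays in the list. The next time the same input is entered, its closest match is that leftover entry at index 2, but with the two-row CSV shown, answer only has indices 0 and 1.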
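
    One way to avoid the problem is to never append the user input to question at all, so that every index the matcher produces is valid for answer. The following is only a minimal sketch of that idea, not the original code. To stay runnable without scikit-learn or NLTK, it swaps the TF-IDF cosine similarity for a simple token-overlap score (the tokens and overlap helpers and the FALLBACK constant are hypothetical names), and it hard-codes the two CSV rows from the question.

```python
# Stand-ins for data['Q'].tolist() and data['A'].tolist() from the question.
question = [
    "Who has scored the most goals in the premier league?",
    "Who has the most appearences in the premier league?",
]
answer = [
    "Alan shearer holds the goal record in the premier league.",
    "Gareth Barry has the most appearences in premier league history.",
]

FALLBACK = "I am sorry! I don't understand you"


def tokens(text):
    # Crude tokenizer (assumption): lowercase, split on whitespace,
    # strip a few punctuation marks from the ends of each word.
    return {w.strip("?.,!") for w in text.lower().split()}


def overlap(a, b):
    # Jaccard overlap of the two token sets. This is a stand-in for the
    # cosine similarity on TF-IDF vectors used in the question.
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def responses(user):
    # Score the input against the stored questions only. The question list
    # is never modified, so len(question) == len(answer) always holds.
    scores = [overlap(user, q) for q in question]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] == 0:
        return FALLBACK
    return answer[best]


print(responses("scored"))
print(responses("football"))
print(responses("football"))  # repeating the input no longer raises IndexError
```

    With scikit-learn, the analogous change would presumably be to call fit_transform on the stored questions and transform on the user input, then take the best match directly instead of the second-best. If you prefer to keep the append approach, moving question.remove(user) so that it also runs on the req == 0 path (for example in a try/finally) should have the same effect.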