python-3.x, nlp, kaggle

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices


I am trying to run a Word2Vec (W2V) algorithm. I get an index error and I am not sure where I am going wrong. Here's the error:

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

and here's the code:

    def makeFeatureVec(words, model, num_features):
        # Function to average all of the word vectors in a given
        # paragraph
        #
        # Pre-initialize an empty numpy array (for speed)
        featureVec = np.zeros((num_features,),dtype="float32")
        #
        nwords = 0.
        #
        # Index2word is a list that contains the names of the words in
        # the model's vocabulary. Convert it to a set, for speed
        index2word_set = set(model.wv.index2word)
        #
        # Loop over each word in the review and, if it is in the model's
        # vocabulary, add its feature vector to the total
        for word in words:
            if word in index2word_set:
                nwords = nwords + 1.
                featureVec = np.add(featureVec,model[word])
        #
        # Divide the result by the number of words to get the average
        featureVec = np.true_divide(featureVec,nwords)
        return featureVec

    def getAvgFeatureVecs(reviews,model,num_features):
        # Given a set of reviews (each one a list of words), calculate
        # the average feature vector for each one and return a 2D numpy array
        #
        # Initialize a counter
        counter = 0.
        #
        # Preallocate a 2D numpy array, for speed
        reviewFeatureVecs = np.zeros((len(reviews),num_features),dtype="float32")
        #
        # Loop through the reviews
        for review in reviews:
            #
            # Print a status message every 1000th review
            if counter%1000. == 0.:
                print ("Review %d of %d" % (counter, len(reviews)))
            #
            # Call the function (defined above) that makes average feature vectors
            reviewFeatureVecs[counter] = makeFeatureVec(review, model,num_features)
            #
            # Increment the counter
            counter = counter + 1.
        return reviewFeatureVecs

This piece of code is from Bag-of-Words-Meets-Bags-of-Popcorn-Kaggle. I am not sure where the error is. I think np.divide is raising the error. I am working on Windows.


Solution

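  • counter = counter + 1.

    should be

    counter = counter + 1 (note the missing dot) or counter += 1.

    The dot makes counter a float (since 1. is equivalent to 1.0), and floats cannot be used as indexes. For the same reason, the initialization counter = 0. should be changed to counter = 0, otherwise reviewFeatureVecs[counter] already fails on the first iteration.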
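As a rough sketch of the fix, the loop can avoid a hand-maintained counter altogether by using enumerate(), which always yields integer positions. The names below (make_feature_vec, get_avg_feature_vecs, toy_vectors) are hypothetical, and a plain dict of word vectors stands in for the trained Word2Vec model, which is an assumption made only to keep the example self-contained. The check for reviews with no known words is also an addition that is not in the original code.

```python
import numpy as np


def make_feature_vec(words, vectors, num_features):
    """Average the vectors of all words found in `vectors`.

    `vectors` is a plain dict (word -> 1-D array), a stand-in for the model.
    """
    feature_vec = np.zeros((num_features,), dtype="float32")
    nwords = 0
    for word in words:
        if word in vectors:
            nwords += 1
            feature_vec += vectors[word]
    # Assumption: leave the vector at zero if no word was recognized,
    # instead of dividing by zero.
    if nwords:
        feature_vec /= nwords
    return feature_vec


def get_avg_feature_vecs(reviews, vectors, num_features):
    """Return a 2-D array with one averaged feature vector per review."""
    review_feature_vecs = np.zeros((len(reviews), num_features), dtype="float32")
    # enumerate() yields integer positions, so the row index is never a float.
    for i, review in enumerate(reviews):
        if i % 1000 == 0:
            print("Review %d of %d" % (i, len(reviews)))
        review_feature_vecs[i] = make_feature_vec(review, vectors, num_features)
    return review_feature_vecs


# Hypothetical toy data: two known words with 2-dimensional vectors.
toy_vectors = {
    "good": np.array([1.0, 0.0], dtype="float32"),
    "bad": np.array([0.0, 1.0], dtype="float32"),
}
reviews = [["good", "bad"], ["good", "unknown"]]
result = get_avg_feature_vecs(reviews, toy_vectors, 2)
```

With the toy data above, the first row averages the vectors for "good" and "bad", and the second row uses only "good" because "unknown" is not in the dict.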