I'm trying to train my model, but I'm stuck on this error.
#Splitting the data
X_train, X_test, y_train, y_test = train_test_split(tweets,labels, random_state=0)
print (len(X_train),len(X_test),len(y_train),len(y_test))
which produces:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-26-b1b63fa90541> in <module>()
1 #Splitting the data
----> 2 X_train, X_test, y_train, y_test = train_test_split(tweets,labels, random_state=0)
3 print (len(X_train),len(X_test),len(y_train),len(y_test))
2 frames
/usr/local/lib/python3.7/dist-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
210 if len(uniques) > 1:
211 raise ValueError("Found input variables with inconsistent numbers of"
--> 212 " samples: %r" % [int(l) for l in lengths])
213
214
ValueError: Found input variables with inconsistent numbers of samples: [1454711, 0]
I don't understand how to fix it. Any suggestions for resolving this error would be appreciated.
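According to this answer over at DataScience, and as the values "[1454711, 0]" in the error message show, your "tweets" and "labels" do not have the same length, which train_test_split() requires. The two numbers are the sample counts of the inputs in the order they were passed, so "tweets" has 1454711 entries while "labels" appears to be empty.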
So read that answer, and double-check that your "labels" data is actually being loaded and has one entry per tweet.
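A quick way to confirm this before calling train_test_split() is to print the length of each input. The following is only a minimal sketch: the helper names describe_lengths and assert_same_length are made up for illustration (they are not part of scikit-learn), and the small tweets and labels lists are stand-in data, not your real dataset.

```python
def describe_lengths(**arrays):
    """Return a dict mapping each input's name to its number of samples."""
    return {name: len(values) for name, values in arrays.items()}


def assert_same_length(**arrays):
    """Raise ValueError early, reporting the length of every input by name."""
    lengths = describe_lengths(**arrays)
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Inconsistent sample counts: {lengths}")
    return lengths


# Hypothetical stand-in data: three tweets but an empty labels list,
# mirroring the [1454711, 0] situation from the traceback.
tweets = ["good day", "bad day", "ok day"]
labels = []

# Show which input is the wrong size before attempting the split.
print(describe_lengths(tweets=tweets, labels=labels))
```

With your real data, call assert_same_length(tweets=tweets, labels=labels) right before train_test_split(). If "labels" reports 0, the problem is in the step that builds "labels" (for example, selecting the wrong column or filtering everything out), not in the split itself.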