I get the following error when I try to run word2vec from Python's gensim library. I am using Python 3.4 on Windows 7, and the complete stack trace is included below. From what I read online, this is supposed to be an issue with Python 2.x, but I am getting it with Python 3.4.
model = word2vec.Word2Vec(sentences, workers=num_workers, \
size=num_features, min_count = min_word_count, \
window = context, sample = downsampling)
Traceback (most recent call last):
  File "<pyshell#137>", line 3, in <module>
    window = context, sample = downsampling)
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 417, in __init__
    self.build_vocab(sentences)
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 483, in build_vocab
    self.finalize_vocab() # build tables & arrays
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 611, in finalize_vocab
    self.reset_weights()
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 888, in reset_weights
    self.syn0[i] = self.seeded_vector(self.index2word[i] + str(self.seed))
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 900, in seeded_vector
    once = random.RandomState(uint32(self.hashfxn(seed_string)))
OverflowError: Python int too large to convert to C long
Note that Python (2 and 3) supports integers of arbitrary size: Python simply keeps adding "digits" (actually groups of them, stored in longs) once a value exceeds the current maximum size. The only difference between py2 and py3 is that the former starts with an actual C int or long before switching to the arbitrary-size Python long. In py3, you always get the Python long type.
Long story short: check the size of your integers.
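One way to check the size of the integers involved is sketched below. It is only an illustration, not gensim's own code: the names `C_LONG_BITS`, `fits_in_c_long`, and `hash32` are hypothetical helpers introduced here. The sketch relies on the fact that a C long is 32 bits wide on Windows, while Python 3's built-in `hash()` can return values well outside that range on a 64-bit interpreter.

```python
import ctypes

# Width of a C long on this platform. On Windows it is 32 bits,
# even for a 64-bit Python build; on 64-bit Linux it is usually 64.
C_LONG_BITS = ctypes.sizeof(ctypes.c_long) * 8


def fits_in_c_long(value):
    """Return True if `value` fits in a signed C long on this platform."""
    limit = 1 << (C_LONG_BITS - 1)
    return -limit <= value < limit


def hash32(value):
    """Hypothetical helper: keep only the low 32 bits of hash(value).

    The result is always in the range 0 .. 2**32 - 1, so it fits in an
    unsigned 32-bit integer regardless of how wide hash() is.
    """
    return hash(value) & 0xFFFFFFFF


# The string mirrors how the traceback builds the seed: word + str(seed).
seed_string = "example" + str(1)
raw = hash(seed_string)

print("C long width in bits:", C_LONG_BITS)
print("raw hash fits in C long:", fits_in_c_long(raw))
print("masked hash:", hash32(seed_string))
```

If the raw hash does not fit, a commonly suggested workaround is to hand the model a hash function that stays within 32 bits, for example `word2vec.Word2Vec(sentences, hashfxn=hash32, ...)`. This assumes your gensim version exposes a `hashfxn` argument on the constructor, which the `self.hashfxn` call in the traceback suggests but which is worth confirming against the installed version.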