python-2.7 python-3.x machine-learning word2vec

Python int too large to convert to C long in python 3.4


I get this error when I try to run word2vec from Python's gensim library. I am using Python 3.4 on Windows 7, and the complete stack trace is included below. What I read online says this is an issue with Python 2.x, but I am getting it in Python 3.4.

model = word2vec.Word2Vec(sentences, workers=num_workers, \
        size=num_features, min_count = min_word_count, \
        window = context, sample = downsampling)

Traceback (most recent call last):
  File "<pyshell#137>", line 3, in <module>
    window = context, sample = downsampling)
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 417, in __init__
    self.build_vocab(sentences)
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 483, in build_vocab
    self.finalize_vocab()  # build tables & arrays
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 611, in finalize_vocab
    self.reset_weights()
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 888, in reset_weights
    self.syn0[i] = self.seeded_vector(self.index2word[i] + str(self.seed))
  File "C:\Python34\lib\site-packages\gensim\models\word2vec.py", line 900, in seeded_vector
    once = random.RandomState(uint32(self.hashfxn(seed_string)))
OverflowError: Python int too large to convert to C long

Solution

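  • Note that Python (both 2 and 3) supports integers of arbitrary size: Python keeps adding "digits" (actually groups of them, stored in its long type) whenever a value outgrows the current size. The only difference between py2 and py3 is that the former starts with an actual C int or long before switching to the arbitrary-size Python long, while py3 always gives you the Python long type.

    A C long, however, has a fixed width (32 bits on Windows), so a Python int that is too large for it cannot be converted. In the traceback, that conversion happens when the result of self.hashfxn(seed_string) is passed to uint32(...).

    Long story short: check the size of your integers.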
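
    To make "check the size of your integers" concrete, here is a minimal sketch in plain Python. It assumes a 32-bit C long (as on Windows); the names fits_in_c_long and hash32 are hypothetical helpers made up for illustration, not part of gensim or numpy.

```python
# Assumption: the target is a 32-bit C long, as on Windows builds of CPython.
C_LONG_BITS = 32
C_LONG_MAX = 2 ** (C_LONG_BITS - 1) - 1
C_LONG_MIN = -(2 ** (C_LONG_BITS - 1))


def fits_in_c_long(value):
    """Return True if value is representable as a signed 32-bit C long."""
    return C_LONG_MIN <= value <= C_LONG_MAX


def hash32(value):
    """Hypothetical helper: keep only the low 32 bits of the built-in hash."""
    return hash(value) & 0xFFFFFFFF


big = 2 ** 40                               # a perfectly valid Python int
print(fits_in_c_long(big))                  # False
print(fits_in_c_long(12345))                # True
print(0 <= hash32("word1") <= 0xFFFFFFFF)   # True
```

    The traceback shows the model calling self.hashfxn, so if the Word2Vec constructor in your gensim version accepts a hashfxn keyword, passing hashfxn=hash32 is one thing to try. Treat that as an assumption to verify against your installed version.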