
Error while loading Word2Vec model on Linux, but running fine on Windows


My code here:

from gensim.models import Word2Vec, KeyedVectors
wv_model = KeyedVectors.load('word2vec.model')

It raises an error when run on Ubuntu, but runs fine on Windows 11. I tried different versions of gensim and loading with pickle directly to work around the problem, but the same error was raised.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/autodl-tmp/test.py:3, in <module>
      1 import gensim
      2 from gensim.models import Word2Vec, KeyedVectors
----> 3 wv_model = KeyedVectors.load('word2vec.model')

File ~/miniconda3/lib/python3.8/site-packages/gensim/utils.py:486, in SaveLoad.load(cls, fname, mmap)
    482 logger.info("loading %s object from %s", cls.__name__, fname)
    484 compress, subname = SaveLoad._adapt_by_suffix(fname)
--> 486 obj = unpickle(fname)
    487 obj._load_specials(fname, mmap, compress, subname)
    488 obj.add_lifecycle_event("loaded", fname=fname)

File ~/miniconda3/lib/python3.8/site-packages/gensim/utils.py:1461, in unpickle(fname)
   1447 """Load object from `fname`, using smart_open so that `fname` can be on S3, HDFS, compressed etc.
   1448 
   1449 Parameters
   (...)
   1458 
   1459 """
   1460 with open(fname, 'rb') as f:
-> 1461     return _pickle.load(f, encoding='latin1')

TypeError: __randomstate_ctor() takes from 0 to 1 positional arguments but 2 were given

The pretrained model was saved as three files.


How can I fix this problem, and why does it happen?


Solution

  • There's a fair chance that the root cause of the error is mismatched interpreter/library versions.

    If so, then ensuring that you're using the exact same versions of Python, NumPy, & Gensim on the Ubuntu system (where you're getting the error) as on the Windows 11 system where loading succeeds should make it work on Ubuntu as well.
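    As a quick sanity check, something like the following, run on both machines, prints the relevant versions so you can compare them side by side (`importlib.metadata` is used so the script still runs even where a package is missing):

    ```python
    import sys
    from importlib import metadata

    # Interpreter version, e.g. "3.8.16"
    print("Python:", sys.version.split()[0])

    # Installed package versions; these should match on both machines.
    for pkg in ("numpy", "gensim"):
        try:
            print(f"{pkg}: {metadata.version(pkg)}")
        except metadata.PackageNotFoundError:
            print(f"{pkg}: not installed")
    ```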

    One other, less likely, possibility is corruption/truncation of the word2vec.model file (or its supporting files). Checking that the files are identical in both places – by size and secure checksum – could rule out any problems there.
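    A minimal sketch of that check, computing a SHA-256 hash and size for each file so the output can be compared across machines. Only `word2vec.model` is named in the question; the `.npy` companion names below follow gensim's usual convention for large-array files and are assumptions – adjust to match your actual three files:

    ```python
    import hashlib
    import os

    def sha256_of(path, chunk_size=1 << 20):
        """Hash a file in chunks so large .npy arrays don't need to fit in memory."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    # Hypothetical file names; replace with the three files you actually have.
    for name in ("word2vec.model",
                 "word2vec.model.wv.vectors.npy",
                 "word2vec.model.syn1neg.npy"):
        if os.path.exists(name):
            print(name, os.path.getsize(name), sha256_of(name))
        else:
            print(name, "not found")
    ```

    Run it in the model directory on both systems; any differing size or hash means the file was not transferred intact.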