I'm using the Gensim package. However, when I try to load the word2vec model, gensim.downloader doesn't seem to exist.
w2v = gensim.downloader.load('word2vec-google-news-300')
Got error message:
AttributeError: module 'gensim' has no attribute 'downloader'
I listed the contents of the gensim module with dir(), and here's what I got:
['__builtins__','__cached__','__doc__','__file__','__loader__','__name__','__package__','__path__','__spec__','__version__','_matutils','corpora','interfaces','logger','logging','matutils','models','parsing','similarities','topic_coherence','utils']
It seems downloader is not in that list. Is there another way to download a specific pretrained model with the Gensim library, and what's wrong with gensim.downloader?
My gensim version is 4.2.0.
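If you're following some example code, you should copy its imports & code exactly. I don't think you'll find any docs/examples suggesting to use the gensim.downloader module the way you've attempted. It's a submodule that a plain import gensim doesn't load, which is why it's missing from your dir() output; the official examples import it explicitly, as in import gensim.downloader as api, and then call api.load(...).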
More generally: I recommend against using gensim.downloader. It hides the actual sources, local paths, & return types of the data it retrieves, and it also runs new code, fetched from the net, that is neither under the Gensim project's source control nor part of any versioned Gensim release. (It's a sketchy software-engineering practice.)
Instead, download the GoogleNews dataset directly from some host, saving the exact original file(s) to a specific place of your choosing. Examine the downloads to understand their filenames/formats (decompressing if necessary).
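As a rough illustration of "examine the downloads", here is a minimal sketch that checks whether a file is gzip-compressed and reads the header line that word2vec-format files start with (vocabulary size, then vector size). The function name inspect_download is made up for this example, and it assumes the file really is in word2vec format; anything else will raise an error when the header is parsed.

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def inspect_download(path):
    """Report size, gzip status, and the word2vec header of a local file.

    Hypothetical helper for illustration; not part of Gensim.
    """
    path = Path(path)
    with path.open("rb") as raw:
        is_gzip = raw.read(2) == GZIP_MAGIC

    # Read through gzip if needed, otherwise read the file as-is.
    opener = gzip.open if is_gzip else open
    with opener(path, "rb") as stream:
        # word2vec files (text or binary) begin with an ASCII header line:
        # b"<vocab_size> <vector_size>\n"
        header = stream.readline(100).split()

    return {
        "size_bytes": path.stat().st_size,
        "is_gzip": is_gzip,
        "vocab_size": int(header[0]),
        "vector_size": int(header[1]),
    }
```

Calling inspect_download on whatever file you saved tells you up front whether you're dealing with a compressed file and how many vectors of what dimensionality to expect.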
Then use other Gensim methods, such as KeyedVectors.load_word2vec_format(), to load from a specific known local file path, with a returned object of a specific documented type.
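A minimal sketch of that loading step, assuming you've already saved the vectors somewhere locally. The wrapper name load_google_news and the up-front path check are my own additions for illustration; the actual work is done by KeyedVectors.load_word2vec_format() with binary=True, since the GoogleNews vectors are distributed in the binary word2vec format.

```python
from pathlib import Path


def load_google_news(path, limit=None):
    """Load word2vec-format vectors from a local file into KeyedVectors.

    Hypothetical convenience wrapper; not part of Gensim.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"No vectors file at {path}")

    # Imported here so the path check above fails fast, before the
    # comparatively slow Gensim import.
    from gensim.models import KeyedVectors

    # binary=True matches the binary word2vec format; limit=N reads only
    # the first N vectors, which saves memory while experimenting.
    return KeyedVectors.load_word2vec_format(str(path), binary=True, limit=limit)
```

Usage would look something like kv = load_google_news("/your/chosen/dir/GoogleNews-vectors-negative300.bin.gz", limit=500000), followed by kv.most_similar("king"). The directory is a placeholder, and the filename is the one this dataset is commonly distributed under, so check it against what you actually downloaded.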
Your code (and your own understanding) will be clearer, more robust, & more secure.