python-2.7, vector, nlp

Load pretrained GloVe vectors in Python


I have downloaded a pretrained GloVe vector file from the internet. It is a .txt file, and I am unable to load and access it. Loading and accessing a binary word vector file with gensim is easy, but I don't know how to do the same when the file is in text format.


Solution

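  • GloVe model files are plain text in a "word vector" format: each line holds one word followed by the components of its vector, separated by spaces. You can open the text file to verify this. Here is a small snippet of code you can use to load a pretrained GloVe file: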

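    import numpy as np

    def load_glove_model(file_path):
        """Load a GloVe text file into a dict mapping word -> NumPy vector."""
        print("Loading GloVe model")
        glove_model = {}
        # GloVe files are UTF-8 text; set the encoding explicitly so the
        # platform default does not cause decode errors (Python 3).
        with open(file_path, "r", encoding="utf-8") as f:
            for line in f:
                # Each line: the word, then its vector components, space-separated
                split_line = line.split()
                word = split_line[0]
                embedding = np.array(split_line[1:], dtype=np.float64)
                glove_model[word] = embedding
        print(f"{len(glove_model)} words loaded!")
        return glove_model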
    

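    You can then access the word vectors by indexing the dictionary that the function returns (the path below is a placeholder for your downloaded file):

    glove_model = load_glove_model("path/to/glove.txt")
    print(glove_model['hello'])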
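
Once the vectors are in a dictionary, a common next step is comparing words by cosine similarity. The following is a minimal sketch and not part of the original answer: `cosine_similarity`, `most_similar`, and the toy three-dimensional vectors are hypothetical names and made-up data for illustration only. It uses just the standard library, and the functions should also accept the NumPy arrays that `load_glove_model` returns.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(model, word, top_n=3):
    """Rank the other words in `model` by cosine similarity to `word`."""
    target = model[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in model.items()
        if other != word
    ]
    # Highest similarity first
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Toy vectors, made up for illustration; real GloVe vectors have 50+ dimensions
toy_model = {
    "king": [0.9, 0.1, 0.0],
    "queen": [0.8, 0.2, 0.1],
    "apple": [0.0, 0.1, 0.9],
}

print(most_similar(toy_model, "king", top_n=2))
```

With a full GloVe vocabulary this linear scan visits every word, so it is slow for repeated queries; stacking the vectors into one matrix and using a single matrix-vector product is the usual way to speed it up.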