r, nlp, word-embedding, text2vec, glove

Read GloVe pre-trained embeddings into R, as a matrix


I'm working in R. The pre-trained GloVe embeddings (e.g., "glove.6B.50d.txt") are available at https://nlp.stanford.edu/projects/glove/. However, I've had no luck reading this text file into R so that the result is a word embedding matrix, with one row per word and one column per vector dimension. Has anyone done this successfully, either from a saved .txt file or directly from the site, and if so, how was the text converted to a matrix in R?


Solution

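  • The text file is already in tabular form: each line holds a word followed by its vector components, separated by single spaces, with no header row. Read it with read.csv("path/to/glove.6B.50d.txt", sep = " ", header = FALSE, quote = ""). Note that the field separator here is a space, not a comma. header = FALSE keeps the first line from being consumed as column names, and quote = "" stops tokens such as " and ' from being treated as quote characters. The result is a data frame, so move the first column into the row names and call as.matrix() on the remaining columns to get the embedding matrix.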
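
To make the data-frame-to-matrix step concrete, here is a minimal sketch. The helper name read_glove is hypothetical (it is not part of any package), and the demo writes a tiny made-up file in the GloVe layout rather than using the real download. It uses read.table instead of read.csv so that every parsing option is spelled out.

```r
# Hypothetical helper: read a GloVe-format text file into a numeric matrix
# with one row per word and one column per embedding dimension.
read_glove <- function(path) {
  df <- read.table(
    path,
    sep = " ",                 # fields are separated by single spaces
    header = FALSE,            # the file has no header row
    quote = "",                # tokens such as " and ' are words, not quote marks
    comment.char = "",         # "#" is a token, not the start of a comment
    stringsAsFactors = FALSE,
    encoding = "UTF-8"
  )
  m <- as.matrix(df[, -1, drop = FALSE])  # numeric columns only
  rownames(m) <- df[[1]]                  # first column holds the words
  colnames(m) <- NULL
  m
}

# Demo on a tiny made-up file in the same layout (not real GloVe vectors).
tmp <- tempfile(fileext = ".txt")
writeLines(c('the 0.1 0.2 0.3',
             '" 0.4 0.5 0.6',
             '# 0.7 0.8 0.9'), tmp)

emb <- read_glove(tmp)
dim(emb)        # rows = words, columns = dimensions
emb["the", ]    # look up a word's vector by row name
```

For the real file, the call would be read_glove("path/to/glove.6B.50d.txt"). One caveat under these assumptions: a token spelled exactly NA would be read as a missing value, which should not matter for the lowercased 6B files but may for other releases.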