r word2vec word-embedding cosine-similarity

Calculate Cosine Similarity for a word2vec model in R

I'm working with the "word2vec" package in R and have run into a problem. I want to find out which words are the closest synonyms to "uncertainty" and "economy", as in the paper by Azqueta-Gavaldon (2020): "Economic policy uncertainty in the euro area: An unsupervised machine learning approach".

I used the word2vec function of the word2vec package to train my own word2vec model. With the function predict(object, ...) I can create a table showing the words that are closest to the words I am interested in.

The problem is that the similarity used by this function is defined as (sqrt(sum(x . y) / ncol(x))), which is not the cosine similarity. I know that I can use the function cosine(x, y), but that function only calculates the cosine similarity between two vectors and can't produce output like the predict function described above.

Does anyone know how to compute the cosine similarity between each word in my word2vec model and every other word, and then output the most similar words to a given word based on these values?

This would really help me a lot and I am already grateful for your answers.

Kind regards, Tom

Solution

The following GitHub gist shows how to use cosine similarity with word2vec models in R: https://gist.github.com/adamlauretig/d15381b562881563e97e1e922ee37920

You can apply this function to any numeric matrix in R, and therefore to the embedding matrix of any word2vec model built in R.
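For readers who can't open the link, here is a minimal base-R sketch of the same idea. It is not the code from the gist. The function name cosine_nearest and the toy matrix are my own, made up for illustration. With a real model I would expect the matrix to come from as.matrix(model), with one row per word and the words as rownames.

```r
# Rank all words by cosine similarity to one query word.
# `emb`: numeric matrix, one row per word, words as rownames.
# (Hypothetical helper, not part of the word2vec package.)
cosine_nearest <- function(emb, word, top_n = 5) {
  stopifnot(is.matrix(emb), word %in% rownames(emb))
  # Scale every row to unit length; the dot product of two
  # unit vectors is exactly their cosine similarity.
  unit <- emb / sqrt(rowSums(emb^2))
  sims <- drop(unit %*% unit[word, ])
  sims <- sims[names(sims) != word]        # drop the query word itself
  sims <- head(sort(sims, decreasing = TRUE), top_n)
  data.frame(term1 = word,
             term2 = names(sims),
             similarity = unname(sims),
             rank = seq_along(sims))
}

# Toy embedding with made-up values, standing in for as.matrix(model)
emb <- rbind(
  uncertainty = c(1, 0, 1),
  risk        = c(2, 0, 2),
  economy     = c(0, 1, 0),
  market      = c(0, 2, 1)
)

result <- cosine_nearest(emb, "uncertainty", top_n = 2)
print(result)
```

The result has the same shape as the table from predict, just ranked by cosine similarity instead. Note that a word whose vector is all zeros would produce NaN here, so such rows should be removed first.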

Kind Regards, Tom