
How to generate Word2vec Vectors in Python?


I am trying to generate Word2Vec vectors.

I have a pandas DataFrame.

I tokenized it and stored the tokens in a column:

df["token"]

I then trained a Word2Vec model from gensim.models:

model = w2v.Word2Vec(
    sentences=df["token"],
    seed=seed,
    workers=num_workers,
    size=num_features,  # renamed to vector_size in gensim 4.0
    min_count=min_word_count,
    window=context_size,
    sample=downsampling,
)

How do I transform my dataframe df now?

That is, what is the equivalent of doing:

model.transform(df)

Solution

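  • If each row of your dataframe holds a single word, you could look up one vector per row:

    df['new_column'] = df['words'].map(lambda w: model.wv[w])

    model.wv['word'] returns the vector for that word, and model.wv[list_of_words] returns a 2-D array with one row per word. Older gensim versions also accepted model['word'] directly, but that shortcut was deprecated and then removed in gensim 4.0. Any word dropped by min_count is not in the vocabulary and raises a KeyError.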
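
The lookup above handles a column of single words, but the question's df["token"] column holds a list of tokens per row. One common way to get a single vector per row is to average the vectors of the tokens the model knows. This is not something the answer itself proposes, so treat the following as a minimal sketch. The helper name average_vector and the toy dictionary are hypothetical; the dictionary only stands in for a trained model's model.wv, which is assumed to support the same `in` and `[]` operations.

```python
def average_vector(tokens, vectors):
    """Average the vectors of the tokens that `vectors` knows about.

    `vectors` is any object supporting `token in vectors` and
    `vectors[token]`: a plain dict here, or (assumed) `model.wv`
    from a trained gensim model.
    Returns None when none of the tokens is in the vocabulary.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    # zip(*known) walks the vectors one dimension at a time.
    return [sum(dim) / len(known) for dim in zip(*known)]


# Toy stand-in for model.wv (hypothetical two-dimensional values).
toy_vectors = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}

rows = [["cat", "dog"], ["cat", "unicorn"], ["unicorn"]]
row_vectors = [average_vector(tokens, toy_vectors) for tokens in rows]
print(row_vectors)  # [[0.5, 0.5], [1.0, 0.0], None]

# With the question's objects this would look like (untested sketch):
# df["vector"] = df["token"].apply(lambda toks: average_vector(toks, model.wv))
```

Skipping unknown tokens avoids the KeyError that a direct lookup raises for words removed by min_count. Rows with no known tokens come back as None, so you may want to drop or fill them before further processing.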