How to train word2vec with your own vocab

I am getting an error while training word2vec on my own vocabulary, and I don't understand why it is happening.

Code:

from gensim.models import word2vec
import logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)

sentences = word2vec.LineSentence('test_data')

model = word2vec.Word2Vec(sentences, size=20)
model.build_vocab(sentences,update=True)
model.train(sentences)

print model.most_similar(['course'])

It throws this error:

2017-08-27 16:50:04,590 : INFO : precomputing L2-norms of word weight vectors
Traceback (most recent call last):
  File "tryword2vec.py", line 23, in <module>
    print model.most_similar(['course']) 
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/word2vec.py", line 1285, in most_similar
    return self.wv.most_similar(positive, negative, topn, restrict_vocab, indexer)
  File "/usr/local/lib/python2.7/dist-packages/gensim/models/keyedvectors.py", line 97, in most_similar
    raise KeyError("word '%s' not in vocabulary" % word)
KeyError: "word 'course' not in vocabulary"

test_data contains:

Bachelor of Engg is a course. M.Tech is a course. ME is a course. B.Tech is a course. Bachelor of Arts is a course. Fashion Design is a course. Multimedia is a course. Mechanical engg is a course. Computer Science is a course. Electronics is a cource. Engineering is a course. MBA is a course. BBA is a course.

Any help is appreciated.

Solution

You are getting the error because the token "course" is not in the vocabulary. The token that is actually present is "course." with a trailing period.

LineSentence splits each line on whitespace only, so the period "." at the end of "course." stays attached to the word.
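To see this without gensim, here is a minimal sketch that mimics the whitespace split and then applies a frequency cutoff. It assumes the default min_count=5 of Word2Vec; the name surviving_tokens is hypothetical and only used for illustration.

```python
from collections import Counter

# Contents of test_data, copied from the question.
TEST_DATA = (
    "Bachelor of Engg is a course. M.Tech is a course. ME is a course. "
    "B.Tech is a course. Bachelor of Arts is a course. Fashion Design is a course. "
    "Multimedia is a course. Mechanical engg is a course. Computer Science is a course. "
    "Electronics is a cource. Engineering is a course. MBA is a course. BBA is a course."
)


def surviving_tokens(text, min_count=5):
    """Split on whitespace and keep tokens seen at least min_count times."""
    counts = Counter(text.split())
    return sorted(tok for tok, n in counts.items() if n >= min_count)


kept = surviving_tokens(TEST_DATA)
print(kept)  # ['a', 'course.', 'is']
```

Only three tokens occur five or more times, and the one you are looking for carries the period, so a lookup for "course" fails.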

Check your vocabulary with model.wv.vocab:

{u'a': <gensim.models.keyedvectors.Vocab at 0x7fe086c461d0>,
 u'course.': <gensim.models.keyedvectors.Vocab at 0x7fe0b4704f90>,
 u'is': <gensim.models.keyedvectors.Vocab at 0x7fe086ba0d10>}
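One possible fix is to clean the tokens yourself before training. The sketch below is only an illustration: clean_tokens is a hypothetical helper, and lowercasing plus stripping punctuation from the edges of each token is an assumption about what you want, not something gensim does for you.

```python
import string


def clean_tokens(line):
    """Lowercase a line, split on whitespace, strip punctuation from token edges."""
    tokens = (tok.strip(string.punctuation) for tok in line.lower().split())
    return [tok for tok in tokens if tok]  # drop tokens that were only punctuation


sample = "Bachelor of Engg is a course. M.Tech is a course."
sentences = [clean_tokens(sample)]
print(sentences[0])
```

A list of token lists like sentences can be passed to word2vec.Word2Vec in place of the LineSentence object. With this little data, the default min_count=5 would still discard most of the course names, so you may need to pass a lower value (for example min_count=1) if you want them in the vocabulary.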

Also, remember to hide your API keys.