python gensim doc2vec

What is the correct way to get doc vector values?


How can I obtain the vector values for a specific document? By tag, like this:

modelValues = model.docvecs['myDocTag']

Or is it only possible by index, like this:

modelValues = model.docvecs[12]

(In the latter case, I would need to know which index matches each tag...)


Solution

  • You can use either, but you should look vectors up with the same kind of tag keys that were provided during training.

    So if your tagged documents had a string tag of 'myDocTag' during training, you should use model.docvecs['myDocTag'].

    If you explicitly provided plain int tags, you can use model.docvecs[12]. (Note that in that case you should assign contiguous ints starting from 0, because gensim treats plain int tags as direct indexes into the vector array and allocates room for every index up to the largest one.)