word2vec

Does word2vec support multiple languages?


I want to know whether the word2vec algorithm can be used to train models for languages other than English, such as Spanish, Chinese, or Italian.


Solution

  • Yes! In fact, one of Google's original word2vec papers highlighted its potential for machine translation between language pairs:

    Exploiting Similarities among Languages for Machine Translation

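    Note that, as with English, you'll need to break example texts into word tokens before feeding them to the Word2Vec algorithm. This may be harder in some languages: Chinese, for example, is written without spaces between words, so it needs a dedicated word-segmentation step.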
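
To make the tokenization point concrete, here is a minimal sketch using only the Python standard library. The function names and sample sentences are made up for illustration. The per-character split for Chinese is a deliberately naive stand-in, and in practice you would probably use a proper word segmenter instead.

```python
import re

# Runs of Unicode word characters. This is good enough for languages that
# separate words with spaces, such as Spanish or Italian.
WORD_RE = re.compile(r"\w+")

def tokenize_spaced(text):
    """Lowercase and split a whitespace-delimited language into word tokens."""
    return WORD_RE.findall(text.lower())

def tokenize_chinese_naive(text):
    """Naive assumption: treat each CJK ideograph as its own token.

    Real Chinese words often span several characters, so a dedicated
    segmenter will usually give better tokens than this.
    """
    return [ch for ch in text if "\u4e00" <= ch <= "\u9fff"]

# One token list per sentence: the usual input shape for Word2Vec trainers.
corpus = [
    tokenize_spaced("El gato come pescado."),
    tokenize_spaced("Il gatto mangia il pesce."),
    tokenize_chinese_naive("猫吃鱼。"),
]

for tokens in corpus:
    print(tokens)
```

Each entry in `corpus` is a list of string tokens. If you happen to use gensim, such a list of token lists can be passed as the `sentences` argument of its `Word2Vec` class. You would normally train one model per language rather than mixing them as this toy corpus does.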