How can I generate non-English (French, Spanish, Italian) word embeddings from English word embeddings?
What are the best ways to generate high-quality word embeddings for non-English words?
The words may include hyphenated compounds (e.g. samsung-galaxy-s9).
For non-English words, you can try using a bilingual dictionary: translate the English words that already have embedding vectors, and assign each translated word the vector of its English counterpart.
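A minimal sketch of the dictionary idea, assuming the English vectors and the French-to-English dictionary are plain Python dicts. The vectors, the dictionary entries and the function name transfer_embeddings are made up for illustration, and averaging over several translations is one possible choice, not the only one.

```python
import numpy as np

# Toy English embeddings (hypothetical values, for illustration only).
english_vectors = {
    "dog": np.array([0.9, 0.1, 0.0]),
    "cat": np.array([0.8, 0.2, 0.1]),
    "house": np.array([0.0, 0.7, 0.6]),
}

# Toy French -> English dictionary (hypothetical entries).
fr_to_en = {"chien": ["dog"], "chat": ["cat"], "maison": ["house", "home"]}


def transfer_embeddings(dictionary, source_vectors):
    """Give each foreign word the mean vector of its known English translations."""
    target_vectors = {}
    for word, translations in dictionary.items():
        known = [source_vectors[t] for t in translations if t in source_vectors]
        if known:  # skip words with no translation in the English vocabulary
            target_vectors[word] = np.mean(known, axis=0)
    return target_vectors


french_vectors = transfer_embeddings(fr_to_en, english_vectors)
```

Words whose translations are all missing from the English vocabulary get no vector here, so dictionary coverage limits how many non-English words you can embed this way.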
You need a large corpus to generate high-quality word embeddings. For non-English languages, you can add bilingual constraints to the original word2vec loss and train on bilingual corpora.
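The answer does not say what form the bilingual constraint takes. One possible form, shown here purely as an assumption, is a penalty that pulls the vectors of aligned word pairs together and is added to the two monolingual losses with a weight lam. The function names, the toy matrices and the alignment list are hypothetical.

```python
import numpy as np


def bilingual_penalty(src_emb, tgt_emb, pairs):
    """Mean squared distance between embeddings of aligned word pairs.

    src_emb, tgt_emb: arrays of shape (vocab_size, dim).
    pairs: list of (src_index, tgt_index) alignments.
    """
    src_idx, tgt_idx = zip(*pairs)
    diff = src_emb[list(src_idx)] - tgt_emb[list(tgt_idx)]
    return float(np.mean(np.sum(diff ** 2, axis=1)))


def total_loss(loss_src, loss_tgt, penalty, lam=1.0):
    """Combine the two monolingual losses with the weighted bilingual term."""
    return loss_src + loss_tgt + lam * penalty


# Toy embedding matrices (hypothetical): row i of `en` is aligned with row i of `fr`.
en = np.array([[1.0, 0.0], [0.0, 1.0]])
fr = np.array([[1.0, 0.0], [0.0, 0.0]])
penalty = bilingual_penalty(en, fr, [(0, 0), (1, 1)])
```

In actual training the monolingual losses would come from the word2vec objective for each language, and lam controls how strongly the two spaces are tied together.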
You can treat a compound word as a single token or split it into its parts, depending on your application.
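Both options can be combined in a small lookup helper. This is a sketch under assumptions: the toy vectors and the name embed_compound are hypothetical, and averaging the part vectors is just one simple way to represent a split compound.

```python
import numpy as np

# Toy vectors for the parts of the compound (hypothetical values).
vectors = {
    "samsung": np.array([1.0, 0.0]),
    "galaxy": np.array([0.0, 1.0]),
    "s9": np.array([1.0, 1.0]),
}


def embed_compound(token, vectors, sep="-"):
    """Return the token's own vector if present, else the mean of its known parts."""
    if token in vectors:
        return vectors[token]  # the compound was trained as a whole word
    parts = [vectors[p] for p in token.split(sep) if p in vectors]
    if not parts:
        return None  # neither the compound nor any of its parts is known
    return np.mean(parts, axis=0)


vec = embed_compound("samsung-galaxy-s9", vectors)
```

If the compound appears often enough in your corpus, keeping it as one token lets it learn its own vector. Splitting is the fallback for rare or unseen compounds.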