
Is it possible to use a gensim word2vec model in deeplearning4j.word2vec?


I'm new to deeplearning4j and I want to build a sentence classifier that takes word vectors as input. I was previously working in Python, where I generated the vector model with gensim, and I want to reuse that model for the new classifier. Is it possible to use gensim's word2vec model in deeplearning4j.word2vec, and if so, how can I do that?


Solution

  • Yes, it's possible, because the original Word2Vec implementation defines a standard file format for its models: gensim can write that format and DL4J can read it.

    To do this:

    1. Using gensim, save the model in the format used by the original Word2Vec implementation:

      w2v_model.wv.save_word2vec_format("path/to/w2v_model.bin", binary=True)
      
    2. From DL4J, load the same pre-trained model:

      Word2Vec w2vModel = WordVectorSerializer.readWord2VecModel("path/to/w2v_model.bin");
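
To make the shared format concrete, here is a minimal pure-Python sketch of the binary layout as I understand it from the original C tool: a text header with the vocabulary size and vector dimension, then for each word its UTF-8 bytes, a space, and the raw float32 values. The function names are hypothetical (they are not part of gensim or DL4J), and little-endian byte order is an assumption that holds on typical hardware. In practice you would rely on the libraries' own readers and writers.

```python
import io
import struct


def write_word2vec_binary(stream, vectors):
    """Write {word: [floats]} using the word2vec binary layout (sketch)."""
    dim = len(next(iter(vectors.values())))
    # Header line: "<vocab_size> <dim>\n" as plain text.
    stream.write(f"{len(vectors)} {dim}\n".encode("utf-8"))
    for word, vec in vectors.items():
        # Entry: word, one space, then `dim` float32 values (little-endian assumed).
        stream.write(word.encode("utf-8") + b" ")
        stream.write(struct.pack(f"<{dim}f", *vec))
        # The C tool ends each entry with a newline; readers skip it.
        stream.write(b"\n")


def read_word2vec_binary(stream):
    """Read the layout written above back into {word: [floats]}."""
    vocab_size, dim = map(int, stream.readline().split())
    vectors = {}
    for _ in range(vocab_size):
        word = bytearray()
        while True:
            ch = stream.read(1)
            if ch == b"":
                raise EOFError("file ended in the middle of a word")
            if ch == b" ":
                break
            if ch != b"\n":  # skip the newline left over from the previous entry
                word += ch
        raw = stream.read(4 * dim)
        vectors[word.decode("utf-8")] = list(struct.unpack(f"<{dim}f", raw))
    return vectors


# Round trip through an in-memory buffer with toy, made-up vectors.
toy = {"man": [0.5, -1.25], "king": [2.0, 0.75]}
buffer = io.BytesIO()
write_word2vec_binary(buffer, toy)
buffer.seek(0)
restored = read_word2vec_binary(buffer)
```

Because the file carries only words and their vectors, what transfers between the two libraries is the lookup table, which is all the queries below need.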
      

    In fact, you can run the same queries against the model in both libraries and you should see the same results. For instance:

    With gensim:

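    # Query through the model's KeyedVectors (.wv); gensim 4 removed these methods from the model object itself.
    print(w2v_model.wv.most_similar("love"))
    print(w2v_model.wv.n_similarity(["man"], ["king"]))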
    

    And with DL4J:

    System.out.println(w2vModel.wordsNearest("love", 10));
    System.out.println(w2vModel.similarity("man", "king"));
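
The two similarity calls agree because, as far as I know, both report the cosine similarity between the stored vectors. A minimal sketch of that computation in plain Python, using made-up toy vectors rather than values from a real model:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional vectors, chosen only for illustration.
man = [1.0, 2.0, 0.0]
king = [2.0, 4.0, 0.0]   # same direction as `man`, so similarity is close to 1
love = [0.0, 0.0, 3.0]   # orthogonal to `man`, so similarity is 0

print(cosine_similarity(man, king))
print(cosine_similarity(man, love))
```

Since the score depends only on the vectors, any difference between gensim and DL4J output should be limited to small floating-point rounding.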