keras, word2vec, pos-tagger

Text classification using Word2Vec and POS tags


I have a medical dataset with examples like this:

Text: "weakness, diarrhea, neck pain" Target: "X.1, Y.1" (the targets are coded diagnoses)

I am also using pre-trained Word2Vec embeddings and POS tagging. For example, the word "weakness" has a word vector like

[0.2 0.04 ........ 0.05] (300 dim)

and its POS tag is "Symptom, Noun".

My question is: how can I combine the POS tags and the word embeddings to train a model with Keras?


Solution

  • There are multiple ways to approach this.

    1. You can build an ensemble, i.e., train two separate models, one on the POS tags and one on the word2vec vectors. If each model gives you a prediction value at its final layer (or anything that can be interpreted as a probability), you can average the two for your final prediction.

    2. You can combine the word2vec vectors with the POS tags and feed both into a single neural network.

    However, I strongly believe POS tags will not help much in this case. Most of these inputs are isolated words, and most of them are nouns, so nearly every token ends up with a similar tag. A feature with that little entropy gives the classifier very little to work with.
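As a rough illustration of option 2, here is a minimal sketch of a two-input Keras model that looks up a frozen word2vec vector and a small learned POS-tag embedding for each token and concatenates them. Everything specific in it is an assumption made for the example, not something from the question: the sizes, the random stand-in for the real word2vec matrix, the average pooling over tokens, and the sigmoid output (which treats the diagnosis codes as a multi-label target, since one text can map to several codes).

```python
import numpy as np
from tensorflow.keras import layers, Model

# Hypothetical sizes for illustration; replace them with your own.
VOCAB_SIZE = 1000   # distinct words (index 0 reserved for padding)
EMB_DIM = 300       # dimensionality of the pre-trained word2vec vectors
NUM_POS_TAGS = 20   # distinct tags such as "Symptom" or "Noun" (0 = padding)
POS_EMB_DIM = 8     # size of the learned tag embedding
MAX_LEN = 10        # tokens per example after padding
NUM_CODES = 50      # distinct diagnosis codes

# Random stand-in for the real word2vec matrix (row i = vector of word index i).
w2v_matrix = np.random.rand(VOCAB_SIZE, EMB_DIM).astype("float32")

# Two parallel integer sequences per example: word indices and tag indices.
word_in = layers.Input(shape=(MAX_LEN,), dtype="int32", name="word_ids")
pos_in = layers.Input(shape=(MAX_LEN,), dtype="int32", name="pos_ids")

# Frozen embedding that will hold the pre-trained vectors.
word_layer = layers.Embedding(VOCAB_SIZE, EMB_DIM, trainable=False, name="w2v")
word_emb = word_layer(word_in)

# Small trainable embedding for the tags.
pos_emb = layers.Embedding(NUM_POS_TAGS, POS_EMB_DIM, name="pos")(pos_in)

# Per-token feature vector = [word2vec vector, tag embedding].
x = layers.Concatenate()([word_emb, pos_emb])

# Simplification: plain average over tokens, padding positions included.
x = layers.GlobalAveragePooling1D()(x)
x = layers.Dense(64, activation="relu")(x)

# One independent probability per diagnosis code (multi-label assumption).
out = layers.Dense(NUM_CODES, activation="sigmoid")(x)

model = Model([word_in, pos_in], out)

# Load the pre-trained vectors now that the embedding layer has been built.
word_layer.set_weights([w2v_matrix])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Dummy batch of 4 examples, just to show the expected input format.
words = np.random.randint(1, VOCAB_SIZE, size=(4, MAX_LEN))
tags = np.random.randint(1, NUM_POS_TAGS, size=(4, MAX_LEN))
probs = model.predict([words, tags], verbose=0)
```

Training would then be a call like `model.fit([word_ids, pos_ids], multi_hot_targets, ...)`, where the three arrays are hypothetical names for your own encoded data. The ensemble in option 1 needs no special architecture: it amounts to averaging the arrays returned by the `predict` calls of two separately trained models.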