Search code examples
embeddingrecurrent-neural-network

Embeddings with recurrent neural networks


I am working on a research project on text data (it's about search engine queries supervised classification). I have already implemented different methods and I have also used different models for the text (such as binary vectors of the dimention of my vocabulary - 1 if the i-th word appears in the text, 0 otherwise - or words embedding with the model word2vec).

My advisor told me that maybe we could find another representation of the queries using Recurrent Neural Network. This representation should keep into account the sequentiality of the words in the text thanks to the recurrence relation. I have read some documentation about RNN but I haven't find anything useful for this goal. I have read lot of things about language modelling (which predict probabilities of the words), but I don't understand how I could adapt this model in order to obtain something like an embedded vector.

Thank you very much!


Solution

  • Usually, if one wants to obtain embeddings from a query or a sentence exploiting RNN, the logits are used. The logits are simply the output values of the network after the forward pass of the full sentence/query.

    The logit values produce a vector that has the dimensions of the output layer (i.e. number of the target classes): usually, it is the vocabulary, since they are extracted from a language model.

    For hints have a look at these:

    Note that in principle one could use also use bidirectional networks or networks trained on other tasks, obtaining smaller embeddings, even if this last option is kind of fancy and it has not been explored up to my knowledge.