
Use BERT for feature extraction of a unique word


I am using BERT to extract the features of a word given the text where it appears, but it seems the current implementation in BERT's official GitHub repo (https://github.com/google-research/bert) can only compute the features of all the words in the text, which consumes too many resources. Is it possible to adapt it for this purpose? Thanks!!


Solution

  • BERT is a contextual model, not a context-free one, which means you don't want to use it on a single word the way you would use word2vec. Contextualising your input is rather the whole point. You *can* feed it a one-word sentence, but then why not just use word2vec?

    Here's what the README says:

    Pre-trained representations can also either be context-free or contextual, and contextual representations can further be unidirectional or bidirectional. Context-free models such as word2vec or GloVe generate a single "word embedding" representation for each word in the vocabulary, so bank would have the same representation in bank deposit and river bank. Contextual models instead generate a representation of each word that is based on the other words in the sentence.

    Hope that makes sense :-)
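
    In other words, you can't avoid running the full sentence through the model, but you can discard everything except the target word's positions afterwards. A minimal sketch of that post-processing step, using placeholder arrays (the token list, shapes, and `word_vector` helper are illustrative, not part of the BERT repo's API):

    ```python
    import numpy as np

    # Hypothetical per-token output of a BERT forward pass over one
    # sentence: shape (sequence_length, hidden_size). In practice this
    # would come from the model; random values stand in for it here.
    hidden_size = 8
    tokens = ["[CLS]", "bank", "de", "##posit", "[SEP]"]
    hidden_states = np.random.rand(len(tokens), hidden_size)

    def word_vector(piece_indices, hidden_states):
        """Average the vectors of the WordPiece tokens forming one word."""
        return hidden_states[piece_indices].mean(axis=0)

    # "deposit" was split into the pieces "de" + "##posit" (indices 2, 3),
    # so its contextual feature is the mean of those two token vectors.
    vec = word_vector([2, 3], hidden_states)
    print(vec.shape)  # (8,)
    ```

    Averaging the WordPiece vectors is a common convention, not the only one; taking the first piece's vector or concatenating pieces are also used.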