
How to Extract Features from Text based on Fine-Tuned BERT Model


I am trying to make a binary predictor on some data which has one column with text and some additional columns with numerical values. My first solution was to use word2vec on the text to extract 30 features and use them together with the other values in a Random Forest. It produces good results. I am interested in improving the TEXT-to-FEATURE model.

I then wanted to improve the feature extraction algorithm by using BERT. I managed to implement a pre-trained BERT model for feature extraction, which gave some improvement over word2vec.

Now I want to know: how can I fine-tune the BERT model on my data - to improve the feature extraction model - to get better text-to-features for my Random Forest algorithm? I know how to fine-tune BERT for a binary predictor (BertForSequenceClassification), but not how to fine-tune it for making a better BERT text-to-feature extraction model. Can I use the layers in BertForSequenceClassification somehow? I spent 2 days trying to find a solution, but did not manage so far...

Kind Regards, Peter


Solution

  • I am dealing with this problem too. As far as I know, you must fine-tune the BERT language model on your own text; according to this issue, masked LM fine-tuning is the suggested approach. Then you can use bert-as-service to extract the features. Note that I haven't tested it yet, but I am going to. I thought it would be good to share it with you :)