I was looking through some notebooks on Kaggle to get a deeper understanding of how NLP works. I came across a notebook for the natural language inference task of predicting the relationship between a given premise and hypothesis. It uses the pretrained BERT model for this task.
I have a question about the build_model() function:
import tensorflow as tf
from transformers import TFBertModel

max_len = 50

def build_model():
    bert_encoder = TFBertModel.from_pretrained("bert-base-multilingual-cased")
    input_word_ids = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    input_mask = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
    input_type_ids = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_type_ids")
    embedding = bert_encoder([input_word_ids, input_mask, input_type_ids])[0]  # confused about this line
    output = tf.keras.layers.Dense(3, activation='softmax')(embedding[:, 0, :])
    model = tf.keras.Model(inputs=[input_word_ids, input_mask, input_type_ids], outputs=output)
    model.compile(tf.keras.optimizers.Adam(lr=1e-5), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model
I am confused about this line: embedding = bert_encoder([input_word_ids, input_mask, input_type_ids])[0]
What does this "embedding" represent, and why is there a [0] after the function call? Why is bert_encoder used to produce this "embedding"?
Thanks in advance!
It is the logits. You have to put the [0] in order to get a torch.Tensor you can compute with. You can also try output.logits instead of output[0].
P.S. I used AutoModelForMaskedLM, not TFBertModel. It might be a little different, but just try printing out your embedding first. =]
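To see what that first element is in the TensorFlow setup from the question, here is a minimal sketch (not from the original notebook) that just prints the encoder output; it assumes the transformers and tensorflow packages are installed, and the tokenizer class and example sentences are purely illustrative:

from transformers import BertTokenizer, TFBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
bert_encoder = TFBertModel.from_pretrained("bert-base-multilingual-cased")

# Encode a toy premise/hypothesis pair (illustrative text only).
enc = tokenizer("The cat sleeps.", "An animal is resting.",
                max_length=50, padding="max_length", truncation=True,
                return_tensors="tf")

outputs = bert_encoder([enc["input_ids"], enc["attention_mask"], enc["token_type_ids"]])

print(type(outputs))     # model output object (a plain tuple in older transformers versions)
print(outputs[0].shape)  # (1, 50, 768): one 768-dim hidden vector per token
# print(outputs.last_hidden_state.shape)  # same tensor by name in recent transformers versions

So with embedding = outputs[0], the expression embedding[:, 0, :] in build_model() picks out the hidden vector of the first token ([CLS]) for each example, and the Dense(3, activation='softmax') layer maps that vector to the three NLI classes.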