Tags: pytorch, recurrent-neural-network, language-model, bert-language-model

BERT: Question-Answering - Total number of permissible words/tokens for training


Let's say I want to train BERT on (query, answer) sentence pairs against a binary label (1, 0) for the correctness of the answer. Will BERT let me use 512 words/tokens each for the query and the answer, or should the query and answer combined be at most 512 tokens? [510 after setting aside the [CLS] and [SEP] tokens]

Thanks in advance!


Solution

  • Together. In fact, together they should be at most 509 tokens, since the sequence also contains one [CLS] token and two [SEP] tokens, one after the question and another after the answer:

    [CLS] q_word1 q_word2 ... [SEP] a_word1 a_word2 ... [SEP]
    

    where q_word refers to words in the question and a_word refers to words in the answer
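
To make the token budget concrete, here is a minimal sketch assuming the Hugging Face transformers library (the question does not name a library, so this is just one common way to tokenize a question-answer pair; the checkpoint name and example strings are placeholders):

    from transformers import BertTokenizer

    # "bert-base-uncased" is an assumed placeholder checkpoint
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

    question = "What is the capital of France?"  # hypothetical example text
    answer = "Paris is the capital of France."   # hypothetical example text

    # Passing the pair as two arguments builds one combined sequence:
    # [CLS] q_word1 q_word2 ... [SEP] a_word1 a_word2 ... [SEP]
    encoding = tokenizer(
        question,
        answer,
        max_length=512,   # the limit applies to the combined sequence
        truncation=True,  # truncate if question + answer exceed the limit
    )

    tokens = tokenizer.convert_ids_to_tokens(encoding["input_ids"])
    print(tokens)
    # ['[CLS]', 'what', 'is', ..., '[SEP]', 'paris', 'is', ..., '[SEP]']

    # Three special tokens are added for a pair, leaving 512 - 3 = 509
    # positions for actual question + answer tokens:
    print(tokenizer.num_special_tokens_to_add(pair=True))  # 3

The token_type_ids in the same encoding mark which positions belong to the question (0) and which to the answer (1), which is how BERT tells the two segments apart.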