Say I want to train BERT on (query, answer) sentence pairs with a binary label (1/0) indicating whether the answer is correct. Does BERT allow 512 words/tokens each for the query and the answer, or must the query and answer together fit in 512? [510 after ignoring the [CLS] and [SEP] tokens]
Thanks in advance!
Together, and actually the limit is 509, since besides the [CLS] token there are two [SEP] tokens, one after the question and another after the answer:
[CLS] q_word1 q_word2 ... [SEP] a_word1 a_word2 ... [SEP]
where q_word refers to words in the question and a_word refers to words in the answer.
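A minimal sketch of that token budget in plain Python (the `build_pair` helper and its longest-first truncation strategy are illustrative, not part of any BERT library):

```python
MAX_LEN = 512      # BERT's maximum sequence length
NUM_SPECIAL = 3    # one [CLS] plus two [SEP]

def build_pair(q_tokens, a_tokens, max_len=MAX_LEN):
    """Lay out [CLS] q ... [SEP] a ... [SEP], truncating so the
    combined sequence never exceeds max_len tokens."""
    budget = max_len - NUM_SPECIAL  # 509 slots shared by question + answer
    q, a = list(q_tokens), list(a_tokens)
    # Simple strategy: trim the longer segment until the pair fits
    while len(q) + len(a) > budget:
        if len(q) >= len(a):
            q.pop()
        else:
            a.pop()
    return ["[CLS]"] + q + ["[SEP]"] + a + ["[SEP]"]

# Two 400-token segments exceed the budget, so the result is capped at 512
seq = build_pair(["w"] * 400, ["w"] * 400)
print(len(seq))  # 512
```

In practice the HuggingFace tokenizers apply an equivalent truncation for you when you pass both sentences and set `truncation=True`, but the arithmetic is the same: 512 minus 3 special tokens leaves 509 for the actual word pieces.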