Ideal number of <BOS> tags in N-gram Language Model


Consider the sentence "There is a monkey". Suppose we build trigrams after adding Beginning of String and End of String tags (<BOS>, <EOS>) to the sentence.

Which of the following is the correct first trigram?

a. <BOS>, <BOS>, There
b. <BOS>, There, is

Which of the following is the correct last trigram?

a. monkey, <EOS>, <EOS>
b. a, monkey, <EOS>

Solution

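  • A trigram model conditions each word on the two preceding tokens, so the first word needs two <BOS> tags as context (n - 1 tags in general for an n-gram model), while a single <EOS> is enough to mark the end. After appending the tags, the sequence is:

    <BOS> <BOS> There is a monkey <EOS>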

    Therefore, the first trigram is option (a):

    <BOS> <BOS> There

    and the last trigram is option (b):

    a monkey <EOS>
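
The padding described in the solution can be checked with a minimal Python sketch. The helper name `padded_ngrams` and its parameters are hypothetical, introduced here only for illustration. It assumes whitespace tokenization, n - 1 <BOS> tags, and a single <EOS> tag.

```python
def padded_ngrams(tokens, n, bos="<BOS>", eos="<EOS>"):
    """Return all n-grams of `tokens` after padding (hypothetical helper).

    Assumption: n - 1 BOS tags are prepended so the first real token has a
    full history, and a single EOS tag is appended to mark the end.
    """
    padded = [bos] * (n - 1) + list(tokens) + [eos]
    # Slide a window of width n over the padded sequence.
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


trigrams = padded_ngrams("There is a monkey".split(), 3)
print(trigrams[0])   # first trigram
print(trigrams[-1])  # last trigram
```

With n = 3 this gives five trigrams for the example sentence. The first is (<BOS>, <BOS>, There) and the last is (a, monkey, <EOS>), matching the solution above.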