I am training a pre-trained BERT model on my data.
I am trying to build a JSON file containing two lists:
first: a list of the model's predictions (the desired values)
second: a list of the true values
However, the first list contains many ['UNK'] tokens.
Something like this:
Why does this happen, and how can I solve it?
These UNK tokens push the accuracy close to zero :( because accuracy is based on an exact match between the true and desired values, and the UNKs make the desired values differ...
What can I do about it?
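
To make the effect measurable, here is a minimal diagnostic sketch. It assumes the predictions and true values are lists of token lists and that the unknown token is spelled "[UNK]"; the function names and the toy data are hypothetical and not taken from my real code.

```python
import json

# Assumed spelling of the unknown token; adjust it to match your tokenizer.
UNK = "[UNK]"


def unk_rate(predictions):
    """Return the fraction of predicted tokens equal to the unknown token."""
    tokens = [tok for seq in predictions for tok in seq]
    if not tokens:
        return 0.0
    return sum(tok == UNK for tok in tokens) / len(tokens)


def exact_match_accuracy(predictions, references):
    """Return the fraction of sequences whose prediction equals the reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)


# Hypothetical toy data, only for illustration.
predictions = [["a", "[UNK]"], ["b", "c"]]
references = [["a", "d"], ["b", "c"]]

# The JSON with the two lists described above.
report = json.dumps(
    {"predictions": predictions, "references": references},
    ensure_ascii=False,
)

print(unk_rate(predictions))
print(exact_match_accuracy(predictions, references))
```

A single UNK is enough to make a whole sequence fail the exact-match test, so even a modest UNK rate can drag the accuracy down sharply.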
Ultimately, I found the problem... The version of BERT I was using had been adapted to the Persian language, and I had not run the Persian normalization process completely :) After completing that phase and doing some debugging of the BERT configuration, the problem was solved :)
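
For anyone hitting the same issue, here is a rough sketch of the kind of normalization that can matter. It is an assumption on my part that character mapping is the relevant step in your case. A common mismatch is Arabic Yeh and Kaf code points appearing where the model's vocabulary expects the Persian forms. The function name `normalize_persian` and the toy vocabulary are hypothetical, and a real BERT tokenizer works on WordPiece sub-tokens rather than whole words, so the lookup below is only an illustration.

```python
import unicodedata

# Arabic code points that often show up in Persian text, mapped to Persian forms.
_CHAR_MAP = str.maketrans({
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})


def normalize_persian(text):
    """Apply Unicode NFC, map Arabic Yeh/Kaf to Persian forms, collapse whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = text.translate(_CHAR_MAP)
    return " ".join(text.split())


# Hypothetical one-word vocabulary spelled with the Persian Kaf (U+06A9).
vocab = {"\u06A9\u062A\u0627\u0628"}

# The same word typed with the Arabic Kaf (U+0643): it looks identical on screen.
raw = "\u0643\u062A\u0627\u0628"

print(raw in vocab)                     # False: would be treated as unknown
print(normalize_persian(raw) in vocab)  # True: matches after normalization
```

The same normalization has to be applied both to the training data and to the inputs at prediction time, otherwise the mismatch comes back.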