How is it possible to initialize BERT with random weights? I want to compare the performance of multilingual, monolingual, and randomly initialized BERT on a masked language modeling task. The first two cases are straightforward:
from transformers import BertTokenizer, BertForMaskedLM

# Pretrained multilingual BERT
tokenizer_multi = BertTokenizer.from_pretrained('bert-base-multilingual-cased')
model_multi = BertForMaskedLM.from_pretrained('bert-base-multilingual-cased')
model_multi.eval()

# Pretrained monolingual (English) BERT
tokenizer_mono = BertTokenizer.from_pretrained('bert-base-cased')
model_mono = BertForMaskedLM.from_pretrained('bert-base-cased')
model_mono.eval()
However, I don't know how to initialize a model with random weights.
Thanks in advance!
You can initialize a BERT model with random weights using the Hugging Face transformers library (see the BertConfig documentation: https://huggingface.co/docs/transformers/v4.28.1/en/model_doc/bert#transformers.BertConfig):
from transformers import BertConfig, BertModel
# Initializing a BERT bert-base-uncased style configuration
configuration = BertConfig()
# Initializing a model (with random weights) from the bert-base-uncased style configuration
model = BertModel(configuration)
# Accessing the model configuration
configuration = model.config
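Note that the snippet above builds a bare BertModel with the default bert-base-uncased configuration. Since you are comparing masked language models, you will likely want a randomly initialized BertForMaskedLM whose vocabulary matches the pretrained tokenizer you pair it with. A minimal sketch along those lines, assuming you reuse the bert-base-cased tokenizer from your question:
from transformers import BertConfig, BertForMaskedLM, BertTokenizer

# Reuse the pretrained tokenizer so all models receive identical inputs
tokenizer = BertTokenizer.from_pretrained('bert-base-cased')

# Load only the architecture (layer sizes, vocab_size, ...) of
# bert-base-cased, not its trained weights
config = BertConfig.from_pretrained('bert-base-cased')

# Instantiating from a config (instead of from_pretrained) gives random weights
model_random = BertForMaskedLM(config)
model_random.eval()
Because the configuration is taken from bert-base-cased, the random model's vocab_size matches the pretrained tokenizer, so you can feed it the same masked inputs as the pretrained models.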