
Initialize HuggingFace Bert with random weights


How is it possible to initialize BERT with random weights? I want to compare the performance of multilingual vs. monolingual vs. randomly initialized BERT on a masked language modeling task. The first two cases are very straightforward:

from transformers import BertTokenizer, BertForMaskedLM

tokenizer_multi = BertTokenizer.from_pretrained('bert-base-multilingual-cased')
model_multi = BertForMaskedLM.from_pretrained('bert-base-multilingual-cased')
model_multi.eval()

tokenizer_mono = BertTokenizer.from_pretrained('bert-base-cased')
model_mono = BertForMaskedLM.from_pretrained('bert-base-cased')
model_mono.eval()

But I don't know how to initialize the model with random weights.

Thanks in advance!


Solution

  • You can initialize a random BERT model using the Hugging Face capabilities (see the BertConfig documentation: https://huggingface.co/docs/transformers/v4.28.1/en/model_doc/bert#transformers.BertConfig)

    from transformers import BertConfig, BertModel
    
    # Initializing a BERT bert-base-uncased style configuration
    configuration = BertConfig()
    
    # Initializing a model (with random weights) from the bert-base-uncased style configuration
    model = BertModel(configuration)
    
    # Accessing the model configuration
    configuration = model.config
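
    Since the question uses BertForMaskedLM, the same pattern applies there: pass a config to the constructor instead of calling from_pretrained, and all weights are drawn from the random initializer rather than a checkpoint. A minimal sketch (the reduced layer/hidden sizes are illustrative assumptions to keep it fast; drop those arguments to get the full bert-base architecture, which the BertConfig defaults match):

    ```python
    import torch
    from transformers import BertConfig, BertForMaskedLM

    # Small sizes for a quick demo; omit these arguments for bert-base defaults
    config = BertConfig(
        vocab_size=1000,
        hidden_size=64,
        num_hidden_layers=2,
        num_attention_heads=2,
        intermediate_size=128,
    )

    # Constructing from a config (instead of from_pretrained) initializes
    # all weights randomly -- no checkpoint is downloaded or loaded
    model = BertForMaskedLM(config)
    model.eval()

    # Sanity check: a forward pass yields one logit per vocabulary entry
    input_ids = torch.randint(0, config.vocab_size, (1, 8))
    with torch.no_grad():
        logits = model(input_ids).logits
    print(logits.shape)  # torch.Size([1, 8, 1000])
    ```

    For a fair comparison with the pretrained models, you can also load only the architecture of an existing checkpoint, e.g. `config = BertConfig.from_pretrained('bert-base-multilingual-cased')`, and then build `BertForMaskedLM(config)` — that downloads just the config file, so the weights stay random while the architecture and vocabulary size match the multilingual model.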