I further trained a Japanese BERT model on the masked language modeling (MLM) task, using a TPU on Google Colab. When I load the resulting model, I get the following error. Is there a way to load the model?
from transformers import BertJapaneseTokenizer, BertForMaskedLM
# Load pre-trained model
model = BertForMaskedLM.from_pretrained('/content/drive/My Drive/Bert/models/sample/')
RuntimeError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/transformers/ in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
469 try:
--> 470 state_dict = torch.load(resolved_archive_file, map_location="cpu")
471 except Exception:
/usr/local/lib/python3.6/dist-packages/torch/ in load(f, map_location, pickle_module, **pickle_load_args)
528 return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
--> 529 return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
/usr/local/lib/python3.6/dist-packages/torch/ in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
701 unpickler.persistent_load = persistent_load
--> 702 result = unpickler.load()
/usr/local/lib/python3.6/dist-packages/torch/ in _rebuild_xla_tensor(data, dtype, device, requires_grad)
151 def _rebuild_xla_tensor(data, dtype, device, requires_grad):
--> 152 tensor = torch.from_numpy(data).to(dtype=dtype, device=device)
153 tensor.requires_grad = requires_grad
RuntimeError: Could not run 'aten::empty.memory_format' with arguments from the 'XLATensorId' backend. 'aten::empty.memory_format' is only available for these backends: [CUDATensorId, SparseCPUTensorId, VariableTensorId, CPUTensorId, MkldnnCPUTensorId, SparseCUDATensorId].
I ran into the same error while using transformers. This is how I solved it.
After training on Colab, I had to send the model to the CPU. Basically, run: model.to('cpu')
Then save the model, which allowed me to import the weights in another instance.
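The move-to-CPU-then-save step can be sketched roughly as follows. This is a minimal illustration, not the original poster's exact code: the tiny BertConfig values and the temporary save directory are placeholders chosen so the snippet runs without downloading weights or needing a TPU. In a real run, model would be the model you just trained.

```python
import tempfile

from transformers import BertConfig, BertForMaskedLM

# Hypothetical tiny config so the sketch runs without downloading weights;
# in practice `model` is the model you just finished training on the TPU.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertForMaskedLM(config)

# Move all parameters off the accelerator before saving, so the checkpoint
# holds ordinary CPU tensors rather than XLA tensors.
model.to("cpu")

# Placeholder output directory (the question used a Google Drive path).
save_dir = tempfile.mkdtemp()
model.save_pretrained(save_dir)

# The checkpoint can now be loaded in an environment without an XLA backend.
reloaded = BertForMaskedLM.from_pretrained(save_dir)
devices = {p.device.type for p in reloaded.parameters()}
```

The key point is the order: calling model.to("cpu") before save_pretrained means the saved file no longer refers to the XLA device.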
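As the error implies, the checkpoint was saved while its tensors were still on the XLA (TPU) device, so it cannot be loaded in an environment where the XLA backend is unavailable:
RuntimeError: Could not run 'aten::empty.memory_format' with arguments from the 'XLATensorId' backend. 'aten::empty.memory_format' is only available for these backends: [CUDATensorId, SparseCPUTensorId, VariableTensorId, CPUTensorId, MkldnnCPUTensorId, SparseCUDATensorId]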