Tags: python, string, pytorch, attributeerror, huggingface-transformers

AttributeError: 'str' object has no attribute 'shape' while encoding tensor using BertModel with PyTorch (Hugging Face)


I get AttributeError: 'str' object has no attribute 'shape' while encoding a tensor using BertModel with PyTorch (Hugging Face). Below is the code:

from transformers import BertModel

bert_model = BertModel.from_pretrained(r'downloads\bert-pretrained-model')
input_ids  # encoded input IDs produced by the tokenizer

Output is:

tensor([[  101,   156, 13329,  ...,     0,     0,     0],
        [  101,   156, 13329,  ...,     0,     0,     0],
        [  101,  1302,  1251,  ...,     0,     0,     0],
        ...,
        [  101, 25456,  1200,  ...,     0,     0,     0],
        [  101,   143,  9664,  ...,     0,     0,     0],
        [  101,  2586,  7340,  ...,     0,     0,     0]])

This is followed by the code below:

last_hidden_state, pooled_output = bert_model(
  input_ids=encoding['input_ids'],
  attention_mask=encoding['attention_mask']
)

Followed by the code below:

last_hidden_state.shape

The output is:

AttributeError                            Traceback (most recent call last)
<ipython-input-70-9628339f425d> in <module>
----> 1 last_hidden_state.shape

AttributeError: 'str' object has no attribute 'shape'

The complete code is available at https://colab.research.google.com/drive/1FY4WtqCi2CQ9RjHj4slZwtdMhwaWv2-2?usp=sharing


Solution

  • The issue is that the return type has changed since the 3.x versions of transformers, so we have to explicitly ask for a tuple of tensors.

    To do that, pass the additional kwarg return_dict=False when calling bert_model() so that the result corresponding to last_hidden_state is an actual tensor.

    last_hidden_state, pooled_output = bert_model(
      input_ids=encoding['input_ids'],
      attention_mask=encoding['attention_mask'],
      return_dict=False   # needed so the call returns a tuple of tensors
    )
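
    For context on why the original unpacking produced strings: newer versions of transformers return a dict-like ModelOutput object, and unpacking such an object into two names iterates over its keys, so last_hidden_state ends up bound to the string 'last_hidden_state' (which has no .shape). A minimal sketch of that behaviour, using a plain OrderedDict as a stand-in for the model output:

    from collections import OrderedDict

    # stand-in for the dict-like output returned by bert_model(...)
    fake_output = OrderedDict(
        last_hidden_state='<tensor would be here>',
        pooler_output='<tensor would be here>',
    )

    # unpacking a mapping iterates over its KEYS, not its values,
    # so both names end up bound to plain strings
    last_hidden_state, pooled_output = fake_output
    print(last_hidden_state)   # 'last_hidden_state' -> a str, hence no .shape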
    

    If you prefer not to pass the extra kwarg, you can instead work with the returned output object directly:

    In [13]: bm = bert_model(
        ...:   encoding_sample['input_ids'],
        ...:   encoding_sample['attention_mask']
        ...: )
    
    In [14]: bm.keys()
    Out[14]: odict_keys(['last_hidden_state', 'pooler_output'])
    
    # accessing last_hidden_state 
    In [15]: bm['last_hidden_state']
    
    In [16]: bm['last_hidden_state'].shape
    Out[16]: torch.Size([1, 17, 768])
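
    The returned ModelOutput also supports attribute-style access, so (continuing the same session with the bm object above) the equivalent would be:

    # equivalently, via attribute access on the output object
    In [17]: bm.last_hidden_state.shape
    Out[17]: torch.Size([1, 17, 768])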