I'm experimenting with Hugging Face transformers to fine-tune microsoft/layoutlmv2-base-uncased through AutoModelForTokenClassification on a custom dataset similar to FUNSD (pre-processed and normalized). After a few training iterations, I get this error:
Traceback (most recent call last):
File "layoutlmV2/train.py", line 137, in <module>
trainer.train()
File "..../lib/python3.8/site-packages/transformers/trainer.py", line 1409, in train
return inner_training_loop(
File "..../lib/python3.8/site-packages/transformers/trainer.py", line 1651, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "..../lib/python3.8/site-packages/transformers/trainer.py", line 2345, in training_step
loss = self.compute_loss(model, inputs)
File "..../lib/python3.8/site-packages/transformers/trainer.py", line 2377, in compute_loss
outputs = model(**inputs)
File "..../lib/python3.8/site-packages/torch/nn/modules/module.py", line 1131, in _call_impl
return forward_call(*input, **kwargs)
File "..../lib/python3.8/site-packages/transformers/models/layoutlmv2/modeling_layoutlmv2.py", line 1228, in forward
outputs = self.layoutlmv2(
File "..../lib/python3.8/site-packages/torch/nn/modules/module.py", line 1131, in _call_impl
return forward_call(*input, **kwargs)
File "..../lib/python3.8/site-packages/transformers/models/layoutlmv2/modeling_layoutlmv2.py", line 902, in forward
text_layout_emb = self._calc_text_embeddings(
File "..../lib/python3.8/site-packages/transformers/models/layoutlmv2/modeling_layoutlmv2.py", line 753, in _calc_text_embeddings
spatial_position_embeddings = self.embeddings._calc_spatial_position_embeddings(bbox)
File "..../lib/python3.8/site-packages/transformers/models/layoutlmv2/modeling_layoutlmv2.py", line 93, in _calc_spatial_position_embeddings
h_position_embeddings = self.h_position_embeddings(bbox[:, :, 3] - bbox[:, :, 1])
File "..../lib/python3.8/site-packages/torch/nn/modules/module.py", line 1131, in _call_impl
return forward_call(*input, **kwargs)
File "..../lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 158, in forward
return F.embedding(
File "..../lib/python3.8/site-packages/torch/nn/functional.py", line 2203, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self
After further inspection (vocab size, bboxes, dimensions, classes...), I noticed that the input tensor contains negative values, and these are what trigger the error; the input tensors of the successful earlier iterations contain non-negative integers only. The negative numbers are produced by _calc_spatial_position_embeddings(self, bbox) in modeling_layoutlmv2.py, line 92:
h_position_embeddings = self.h_position_embeddings(bbox[:, :, 3] - bbox[:, :, 1])
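This explains the crash: nn.Embedding is a plain lookup table, so any index outside [0, num_embeddings - 1], including a negative height from an inverted box, raises exactly this IndexError. A minimal sketch reproducing it (the embedding dimensions below are illustrative, not the model's actual config):

import torch
import torch.nn as nn

# stand-in for self.h_position_embeddings; sizes are made up for the demo
h_position_embeddings = nn.Embedding(1024, 128)

heights = torch.tensor([[11, 12, -6]])   # -6 mimics a box with y2 < y1
h_position_embeddings(heights)           # IndexError: index out of range in self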
Example of the input tensor that triggers the error in torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse):
tensor([[ 0, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 11, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 11, 9, 9, 9, 9, 9, 9, 9, 9, 9,
9, 9, 9, 9, 9, 9, 9, 10, 10, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 11, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 12, 12,
12, 12, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
11, 11, 12, 12, 12, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 12, 12, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
8, 5, 5, 5, 5, 5, 5, -6, -6, -6, -6, -6, -6, 1, 1, 1, 1, 1,
5, 5, 5, 5, 5, 5, 7, 5, 7, 7, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0]])
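To locate the offending samples, one can scan the batched bbox tensor for boxes with x2 < x1 or y2 < y1 before it reaches the model. A quick diagnostic sketch (find_bad_boxes is a hypothetical helper; bbox has the model's expected shape (batch, seq_len, 4) with (x1, y1, x2, y2) coordinates):

def find_bad_boxes(bbox):
    # boxes whose width or height would be negative after the subtraction
    bad = (bbox[:, :, 2] < bbox[:, :, 0]) | (bbox[:, :, 3] < bbox[:, :, 1])
    return bad.nonzero(as_tuple=False)   # (sample_idx, token_idx) pairs

# e.g. print(find_bad_boxes(batch["bbox"]))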
After double-checking the dataset, and specifically the coordinates of the labels, I found that some rows' bbox coordinates produce a zero or even negative width or height. Here's a simplified example:
x1, y1, x2, y2 = dataset_row["bbox"]
print((x2 - x1 < 1) or (y2 - y1 < 1))  # output is sometimes True
After removing these labels from the dataset, the issue was resolved.
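For anyone hitting the same issue, the cleanup can be done with datasets before encoding, keeping words, boxes, and labels aligned. A sketch assuming FUNSD-style column names (words / bboxes / ner_tags; adjust to your schema):

def drop_degenerate_boxes(example):
    # keep only entries whose box has strictly positive width and height
    keep = [i for i, (x1, y1, x2, y2) in enumerate(example["bboxes"])
            if (x2 - x1 >= 1) and (y2 - y1 >= 1)]
    return {
        "words": [example["words"][i] for i in keep],
        "bboxes": [example["bboxes"][i] for i in keep],
        "ner_tags": [example["ner_tags"][i] for i in keep],
    }

dataset = dataset.map(drop_degenerate_boxes)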