Tags: python, deep-learning, huggingface-transformers, onnxruntime

ONNX runtime bert inference: RuntimeError: Input must be a list of dictionaries or a single numpy array for input 'attention_mask'


I am trying to use a Hugging Face BERT model with ONNX Runtime. I followed the docs to convert the model and am now trying to run inference. My inference code is:

from transformers import BertTokenizer, BertModel, BertTokenizerFast
import onnxruntime

sess = onnxruntime.InferenceSession("onnx/bert-base-cased/model.onnx")
tokenizer = BertTokenizerFast.from_pretrained('bert-base-cased')

encoded_input = tokenizer(text, return_tensors='pt', padding='max_length')
output = sess.run([i.name for i in sess.get_outputs()], dict(encoded_input)) # or sess.run(None, input_dict)

I am getting the following error:

Traceback (most recent call last):
  File "/home/srg/glib-repos/invoice_locality_extraction/cloud_run_functions/name_extraction/main.py", line 94, in invoice_extractor
    inference_results = infer.infer(v)
  File "/home/srg/glib-repos/invoice_locality_extraction/cloud_run_functions/name_extraction/infer.py", line 111, in infer
    emb, call = process(tokenizer, model, item_text_results[i:i+batch_size], call+1)
  File "/home/srg/glib-repos/invoice_locality_extraction/cloud_run_functions/name_extraction/get_embeddings.py", line 50, in process
    output = model.run([i.name for i in model.get_outputs()], input_dict)
  File "/home/sajan/pdf2words-env/lib/python3.7/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 192, in run
    return self._sess.run(output_names, input_feed, run_options)
RuntimeError: Input must be a list of dictionaries or a single numpy array for input 'attention_mask'.

Solution

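  • The tokenizer call uses return_tensors='pt', which returns PyTorch tensors, but sess.run() expects NumPy arrays as input values. According to the docs, use return_tensors='np' instead of return_tensors='pt'.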
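
A minimal sketch of the fix, reusing the model path and tokenizer name from the question. The helper names to_numpy_feed and run_bert are hypothetical, not part of either library. Casting to int64 and filtering by the session's input names are assumptions about how the model was exported, so check sess.get_inputs() for your own model.

```python
import numpy as np


def to_numpy_feed(encoded, input_names=None):
    """Build a dict of NumPy arrays suitable for sess.run().

    Values may be lists, NumPy arrays, or PyTorch tensors.
    """
    feed = {}
    for name, value in dict(encoded).items():
        if input_names is not None and name not in input_names:
            continue  # skip entries the exported graph does not accept
        if hasattr(value, "detach"):  # looks like a PyTorch tensor
            value = value.detach().cpu().numpy()
        # Assumption: the exported BERT graph takes int64 inputs.
        feed[name] = np.asarray(value, dtype=np.int64)
    return feed


def run_bert(text, model_path="onnx/bert-base-cased/model.onnx"):
    # Imported lazily so to_numpy_feed works without these packages.
    import onnxruntime
    from transformers import BertTokenizerFast

    sess = onnxruntime.InferenceSession(model_path)
    tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")

    # return_tensors="np" makes the tokenizer return NumPy arrays.
    encoded = tokenizer(text, return_tensors="np", padding="max_length")
    input_names = {i.name for i in sess.get_inputs()}
    return sess.run(None, to_numpy_feed(encoded, input_names))
```

With return_tensors='np' the conversion in to_numpy_feed is mostly a safety net. It is useful if other code still hands you PyTorch tensors, since it converts them before they reach sess.run().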