DeBERTa ONNX export does not work for token_type_ids

I noticed that this PR already adds support for exporting DeBERTa-type models to ONNX. I read through the PR and checked everything I could. However, I can't make the following code work.

from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForTokenClassification

# Placeholder for the private fine-tuned DeBERTa-v2 NER checkpoint
tokenizer = AutoTokenizer.from_pretrained("{custom_fine_tuned_NER_DeBERTaV2}")
# export=True converts the PyTorch checkpoint to ONNX at load time
model = ORTModelForTokenClassification.from_pretrained(
    "{custom_fine_tuned_NER_DeBERTaV2}", export=True, use_auth_token=True
)

pipe = pipeline("ner", model=model, tokenizer=tokenizer)

It fails with the following stack trace:

InvalidArgument                           Traceback (most recent call last)
Cell In[25], line 1
----> 1 pipe("I am a skilled engineer. I have worked in JS, CPP, Java, J2ME, and Python. I know Oracle and MySQL")

File ~/skill_extraction/skill_extraction/lib/python3.10/site-packages/transformers/pipelines/, in TokenClassificationPipeline.__call__(self, inputs, **kwargs)
    211 if offset_mapping:
    212     kwargs["offset_mapping"] = offset_mapping
--> 214 return super().__call__(inputs, **kwargs)

File ~/skill_extraction/skill_extraction/lib/python3.10/site-packages/transformers/pipelines/, in Pipeline.__call__(self, inputs, num_workers, batch_size, *args, **kwargs)
   1101     return next(
   1102         iter(
   1103             self.get_iterator(
   1106         )
   1107     )
   1108 else:
-> 1109     return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)

File ~/skill_extraction/skill_extraction/lib/python3.10/site-packages/transformers/pipelines/, in Pipeline.run_single(self, inputs, preprocess_params, forward_params, postprocess_params)
   1114 def run_single(self, inputs, preprocess_params, forward_params, postprocess_params):
   1115     model_inputs = self.preprocess(inputs, **preprocess_params)
-> 1116     model_outputs = self.forward(model_inputs, **forward_params)
   1117     outputs = self.postprocess(model_outputs, **postprocess_params)
   1118     return outputs

File ~/skill_extraction/skill_extraction/lib/python3.10/site-packages/transformers/pipelines/, in Pipeline.forward(self, model_inputs, **forward_params)
   1013     with inference_context():
   1014         model_inputs = self._ensure_tensor_on_device(model_inputs, device=self.device)
-> 1015         model_outputs = self._forward(model_inputs, **forward_params)
   1016         model_outputs = self._ensure_tensor_on_device(model_outputs, device=torch.device("cpu"))
   1017 else:

File ~/skill_extraction/skill_extraction/lib/python3.10/site-packages/transformers/pipelines/, in TokenClassificationPipeline._forward(self, model_inputs)
    238     logits = self.model([0]
    239 else:
--> 240     output = self.model(**model_inputs)
    241     logits = output["logits"] if isinstance(output, dict) else output[0]
    243 return {
    244     "logits": logits,
    245     "special_tokens_mask": special_tokens_mask,
    248     **model_inputs,
    249 }

File ~/skill_extraction/skill_extraction/lib/python3.10/site-packages/optimum/, in OptimizedModel.__call__(self, *args, **kwargs)
     84 def __call__(self, *args, **kwargs):
---> 85     return self.forward(*args, **kwargs)

File ~/skill_extraction/skill_extraction/lib/python3.10/site-packages/optimum/onnxruntime/, in ORTModelForTokenClassification.forward(self, input_ids, attention_mask, token_type_ids, **kwargs)
   1360     onnx_inputs["token_type_ids"] = token_type_ids
   1362 # run inference
-> 1363 outputs =, onnx_inputs)
   1364 logits = outputs[self.output_names["logits"]]
   1366 if use_torch:

File ~/skill_extraction/skill_extraction/lib/python3.10/site-packages/onnxruntime/capi/, in, output_names, input_feed, run_options)
    198     output_names = [ for output in self._outputs_meta]
    199 try:
--> 200     return, input_feed, run_options)
    201 except C.EPFail as err:
    202     if self._enable_fallback:

InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid Feed Input Name:token_type_ids

I am not sure what I am doing wrong. The same code works for other models (for example, another model that is BERT-based and was fine-tuned by me).

I am using tokenizers version 0.13.3, transformers version 4.27.4, and optimum version 1.7.3. I am on an AMD-based machine in EC2 (AWS).

Please help as this is blocking me from optimizing the model. I can't find any docs on it.


  • I have the same error. One way to solve it was to drop "token_type_ids" when tokenizing the text and keep only 'input_ids' and 'attention_mask':

    tokenizer.model_input_names = ['input_ids', 'attention_mask']
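
A more general form of this workaround is to drop every tokenizer output that the exported graph does not declare as an input. Below is a minimal sketch of that filtering step in plain Python, with no dependency on transformers or onnxruntime. `drop_unexpected_inputs` is a hypothetical helper name, and the hard-coded `accepted_names` list is an assumption standing in for the names you would read from the ONNX session (with onnxruntime, something like `[i.name for i in session.get_inputs()]`).

```python
def drop_unexpected_inputs(encoded, accepted_names):
    """Return a copy of `encoded` with only the keys the model accepts.

    `encoded` is a dict such as the one a tokenizer returns;
    `accepted_names` is the list of input names declared by the exported graph.
    """
    accepted = set(accepted_names)
    return {name: value for name, value in encoded.items() if name in accepted}


# Illustrative tokenizer output (made-up token ids, not from a real model).
encoded = {
    "input_ids": [[1, 2, 3]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],
}

# Assumption: the exported DeBERTa graph only declares these two inputs,
# which is what the "Invalid Feed Input Name:token_type_ids" error suggests.
accepted_names = ["input_ids", "attention_mask"]

filtered = drop_unexpected_inputs(encoded, accepted_names)
rejected = sorted(set(encoded) - set(filtered))

print("kept:", sorted(filtered))
print("dropped:", rejected)
```

Setting `tokenizer.model_input_names` as shown above has the same effect at the tokenizer level, so the pipeline never produces `token_type_ids` in the first place.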