tensorflow, tensorflow2.0, huggingface-transformers, bert-language-model, transformer-model

ValueError: Exception encountered when calling layer 'tf_bert_model' (type TFBertModel)


I have been trying to run TFBertModel from Transformers, but it keeps throwing this error:

ValueError                                Traceback (most recent call last)
Cell In[9], line 1
----> 1 bert_output = bert_model([input_ids, attention_mask])

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\tf_keras\src\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\transformers\modeling_tf_utils.py:436, in unpack_inputs.<locals>.run_call_with_unpacked_inputs(self, *args, **kwargs)
    433 else:
    434     config = self.config
--> 436 unpacked_inputs = input_processing(func, config, **fn_args_and_kwargs)
    437 return func(self, **unpacked_inputs)

File ~\AppData\Local\Programs\Python\Python312\Lib\site-packages\transformers\modeling_tf_utils.py:530, in input_processing(func, config, **kwargs)
    528             output[parameter_names[i]] = input
    529         else:
--> 530             raise ValueError(
    531                 f"Data of type {type(input)} is not allowed only {allowed_types} is accepted for"
    532                 f" {parameter_names[i]}."
    533             )
    534 elif isinstance(main_input, Mapping):
    535     if "inputs" in main_input:

ValueError: Exception encountered when calling layer 'tf_bert_model' (type TFBertModel).

Data of type <class 'keras.src.backend.common.keras_tensor.KerasTensor'> is not allowed only (<class 'tensorflow.python.framework.tensor.Tensor'>, <class 'bool'>, <class 'int'>, <class 'transformers.utils.generic.ModelOutput'>, <class 'tuple'>, <class 'list'>, <class 'dict'>, <class 'numpy.ndarray'>) is accepted for input_ids.

Call arguments received by layer 'tf_bert_model' (type TFBertModel):
  • input_ids=['<KerasTensor shape=(None, 128), dtype=int32, sparse=False, name=input_ids>', '<KerasTensor shape=(None, 128), dtype=int32, sparse=False, name=attention_mask>']
  • attention_mask=None
  • token_type_ids=None
  • position_ids=None
  • head_mask=None
  • inputs_embeds=None
  • encoder_hidden_states=None
  • encoder_attention_mask=None
  • past_key_values=None
  • use_cache=None
  • output_attentions=None
  • output_hidden_states=None
  • return_dict=None
  • training=False

The error comes from this particular line:

bert_output = bert_model([input_ids, attention_mask])

Here is the entire code; I'm new to using BERT and followed ChatGPT's suggestions:

from transformers import TFBertModel
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras import layers

bert_model = TFBertModel.from_pretrained('bert-base-uncased')

input_ids = tf.keras.layers.Input(shape=(128,), dtype='int32', name='input_ids')
attention_mask = tf.keras.layers.Input(shape=(128,), dtype='int32', name='attention_mask')

bert_output = bert_model([input_ids, attention_mask]) # <= this line is what gives the error

pooled_output = bert_output.pooler_output

# Custom classifier layers
x = layers.Dense(128, activation='relu')(pooled_output)
x = layers.Dropout(0.3)(x)
output = layers.Dense(2, activation='softmax')(x)  # Adjust num_classes as needed

# Build model
model = Model(inputs=[input_ids, attention_mask], outputs=output)

# Compile the model
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_dataset, epochs=3, validation_data=test_dataset)

This error has persisted despite my researching it for ages now; I can't seem to solve or even understand the issue. I have tried downgrading both TensorFlow and Transformers (to 4.17), but the error still persists. I also asked Gemini and ChatGPT, and they couldn't solve it either: they kept saying I should use a TensorFlow tensor rather than a KerasTensor, then proceeded to give me the same code.

I'm using the latest versions of both TensorFlow (2.18.0) and Transformers (4.47.1), and my Python version is 3.12.

I also tried passing input_ids and attention_mask differently, such as this:

bert_output = bert_model([{'input_ids': input_ids, 'attention_mask': attention_mask}])

but the error remains


Solution

  • TL;DR: set this environment variable before the first TensorFlow or Transformers import (at the very top of the script, or in a fresh kernel):

    import os
    
    os.environ['TF_USE_LEGACY_KERAS'] = '1'
    

    Explanation: the Transformers package is built against Keras 2 objects, but the current version is Keras 3, which has shipped inside TensorFlow since version 2.16. The functional-API Input layers therefore produce Keras 3 KerasTensor objects, which TFBertModel rejects with exactly this ValueError. The fastest fix without downgrading TensorFlow is to set the legacy-Keras flag as above, which makes tf.keras resolve to the Keras 2 tf-keras package (already installed, judging by the tf_keras paths in your traceback).
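
  • A fuller sketch of the fixed script, under the assumption that the flag runs before anything imports TensorFlow (it is read at import time, so it must come first). Everything else is your original code, except the inputs are passed as a dict rather than a list, which makes the name-to-tensor mapping explicit:

    import os
    os.environ['TF_USE_LEGACY_KERAS'] = '1'  # must run before TensorFlow is imported

    import tensorflow as tf
    from tensorflow.keras import Model, layers
    from transformers import TFBertModel

    bert_model = TFBertModel.from_pretrained('bert-base-uncased')

    input_ids = layers.Input(shape=(128,), dtype='int32', name='input_ids')
    attention_mask = layers.Input(shape=(128,), dtype='int32', name='attention_mask')

    # With legacy Keras active these are Keras 2 symbolic tensors, which
    # TFBertModel accepts; under Keras 3 this same call raises the ValueError.
    bert_output = bert_model({'input_ids': input_ids, 'attention_mask': attention_mask})
    pooled_output = bert_output.pooler_output

    # Custom classifier layers
    x = layers.Dense(128, activation='relu')(pooled_output)
    x = layers.Dropout(0.3)(x)
    output = layers.Dense(2, activation='softmax')(x)  # adjust num_classes as needed

    model = Model(inputs=[input_ids, attention_mask], outputs=output)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.summary()

    To confirm the flag took effect, tf.keras.__version__ should report a 2.x release; if it still shows 3.x, install the legacy package with pip install tf-keras and restart the runtime.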