Tags: python, tensorflow, bert-language-model

TensorFlow expected 2 inputs but received 1 input tensor


Hey guys, so I'm building a model based on RoBERTa-base, and at the end, when I try to fit the model, I get an error saying:

ValueError: Layer model_39 expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(16, 128) dtype=float64>]

I'm using tf.data.Dataset to make the dataset:

def map_dataset(ids, masks, labels):
    return {'input_ids': ids, 'input_mask': masks}, labels

# Create dataset
dataset = tf.data.Dataset.from_tensor_slices((ids, mask, labels))
dataset.map(map_dataset)
dataset = dataset.shuffle(10000).batch(BATCH_SIZE, drop_remainder=True)

Supposedly the dataset is generating the two inputs properly, but for some reason fit() refuses to accept it and I'm not sure why.
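
A quick way to verify what the pipeline actually yields is dataset.element_spec; a correctly mapped dataset should report a ({'input_ids': ..., 'input_mask': ...}, labels) pair rather than a flat 3-tuple:

# element_spec describes the structure fit() will receive from this dataset;
# a flat (ids, mask, labels) 3-tuple means the mapping never took effect.
print(dataset.element_spec)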

Full code:

import pandas as pd
import tensorflow as tf
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.metrics import CategoricalAccuracy
from transformers import TFAutoModel, AutoTokenizer

LEN_SEQ = 128
BATCH_SIZE = 16
TEST_TRAIN_SPLIT = 0.9
TRANSFORMER = 'roberta-base'

# Load RoBERTa model
base_model = TFAutoModel.from_pretrained(TRANSFORMER)
for layer in base_model.layers:
    layer.trainable = False

# Define input layers
input_ids = tf.keras.layers.Input(shape=(LEN_SEQ,), name='input_ids', dtype='int32')
input_mask = tf.keras.layers.Input(shape=(LEN_SEQ,), name='input_mask', dtype='int32')

# Define hidden layers
embedding = base_model([input_ids, input_mask])[1]
layer = tf.keras.layers.Dense(LEN_SEQ * 2, activation='relu')(embedding)
layer = tf.keras.layers.Dense(LEN_SEQ, activation='relu')(layer)

# Define output
output = tf.keras.layers.Dense(1, activation='softmax', name='output')(layer)

model = tf.keras.Model(inputs=[input_ids, input_mask], outputs=[output])

model.compile(
    optimizer = Adam(learning_rate=1e-3, decay=1e-4),
    loss = CategoricalCrossentropy(),
    metrics = [
        CategoricalAccuracy('accuracy')
    ]
)

# Load data
df = pd.read_csv('train-processed.csv')
df = df.head(100)
samples_count = len(df)

# Tokenize data
tokenizer = AutoTokenizer.from_pretrained(TRANSFORMER)
tokens = tokenizer(
    df['first_Phrase'].tolist(),
    max_length=LEN_SEQ,
    truncation=True,
    padding='max_length',
    add_special_tokens=True,
    return_tensors='tf'
)
ids = tokens['input_ids']
mask = tokens['attention_mask']

def map_dataset(ids, masks, labels):
    return {'input_ids': ids, 'input_mask': masks}, labels

# Create dataset
dataset = tf.data.Dataset.from_tensor_slices((ids, mask, labels))
dataset.map(map_dataset)
dataset = dataset.shuffle(10000).batch(BATCH_SIZE, drop_remainder=True)

# Split data into train and test sets
train_size = int((samples_count / BATCH_SIZE) * TEST_TRAIN_SPLIT)
train = dataset.take(train_size)
test = dataset.skip(train_size)

# Train model
history = model.fit(
    train,
    validation_data=test,
    epochs=2
)

Inside dataset -> <BatchDataset shapes: ((16, 128), (16, 128), (16, 5)), types: (tf.float64, tf.float64, tf.float64)>

Inside train -> <TakeDataset shapes: ((16, 128), (16, 128), (16, 5)), types: (tf.float64, tf.float64, tf.float64)>


Any help is appreciated. I'm new to transformers, so please feel free to point out any extra considerations.


Solution

  • So I managed to fix this, as far as I know, with the help of @Djinn. I removed the tf.data API and instead built the datasets manually using the following code:

    # Split into training and validation sets
    train_size = int(samples_count * TEST_TRAIN_SPLIT)
    train = [ids[:train_size], mask[:train_size]]
    train_labels = labels[:train_size]
    test = [ids[train_size:], mask[train_size:]], labels[train_size:]
    # Train model
    history = model.fit(
        train, train_labels,
        validation_data=test,
        epochs=10
    )
    

    This seems to be working and fit() accepted the data, but feel free to point out if this is wrong or could be done differently.
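
  • For what it's worth, a likely reason the original tf.data pipeline failed is that Dataset.map() returns a new dataset instead of modifying the existing one in place. Since the mapped result was never assigned back, fit() received the raw (ids, mask, labels) 3-tuple, which matches the shapes printed above. A minimal sketch of that fix, keeping the rest of the original pipeline:

    # map() does not mutate the dataset in place; assign its return value back
    dataset = tf.data.Dataset.from_tensor_slices((ids, mask, labels))
    dataset = dataset.map(map_dataset)
    dataset = dataset.shuffle(10000).batch(BATCH_SIZE, drop_remainder=True)

    Separately, since the labels appear to be one-hot over 5 classes (shape (16, 5) above), the output layer would normally need 5 units, e.g. Dense(5, activation='softmax'), for CategoricalCrossentropy to line up; a single-unit softmax always outputs 1.0.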