Hey guys, I'm building a model based on roberta-base, and when I try to fit the model I get the following error:
ValueError: Layer model_39 expects 2 input(s), but it received 1 input tensors. Inputs received: [<tf.Tensor 'IteratorGetNext:0' shape=(16, 128) dtype=float64>]
I'm using tf.data.Dataset to make the dataset:
def map_dataset(ids, masks, labels):
    return {'input_ids': ids, 'input_mask': masks}, labels
# Create dataset
dataset = tf.data.Dataset.from_tensor_slices((ids, mask, labels))
dataset.map(map_dataset)
dataset = dataset.shuffle(10000).batch(BATCH_SIZE, drop_remainder=True)
As far as I can tell, the dataset should be yielding the 2 inputs properly, but fit() rejects it and I'm not sure why.
Full code:
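import pandas as pd
import tensorflow as tf
from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.metrics import CategoricalAccuracy
from tensorflow.keras.optimizers import Adam
from transformers import AutoTokenizer, TFAutoModel

LEN_SEQ = 128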
BATCH_SIZE = 16
TEST_TRAIN_SPLIT = 0.9
TRANSFORMER = 'roberta-base'
# Load roberta model
base_model = TFAutoModel.from_pretrained('roberta-base')
for layer in base_model.layers:
    layer.trainable = False
# Define input layers
input_ids = tf.keras.layers.Input(shape=(LEN_SEQ,), name='input_ids', dtype='int32')
input_mask = tf.keras.layers.Input(shape=(LEN_SEQ,), name='input_mask', dtype='int32')
# Define hidden layers
embedding = base_model([input_ids, input_mask])[1]
layer = tf.keras.layers.Dense(LEN_SEQ * 2, activation='relu')(embedding)
layer = tf.keras.layers.Dense(LEN_SEQ, activation='relu')(layer)
# Define output
output = tf.keras.layers.Dense(1, activation='softmax', name='output')(layer)
model = tf.keras.Model(inputs=[input_ids, input_mask], outputs=[output])
model.compile(
    optimizer=Adam(learning_rate=1e-3, decay=1e-4),
    loss=CategoricalCrossentropy(),
    metrics=[
        CategoricalAccuracy('accuracy')
    ]
)
# Load data
df = pd.read_csv('train-processed.csv')
df = df.head(100)
samples_count = len(df)
# Tokenize data
tokenizer = AutoTokenizer.from_pretrained(TRANSFORMER)
tokens = tokenizer(
    df['first_Phrase'].tolist(),
    max_length=LEN_SEQ,
    truncation=True,
    padding='max_length',
    add_special_tokens=True,
    return_tensors='tf'
)
ids = tokens['input_ids']
mask = tokens['attention_mask']
def map_dataset(ids, masks, labels):
    return {'input_ids': ids, 'input_mask': masks}, labels
# Create dataset
dataset = tf.data.Dataset.from_tensor_slices((ids, mask, labels))
dataset.map(map_dataset)
dataset = dataset.shuffle(10000).batch(BATCH_SIZE, drop_remainder=True)
# Split data into train and test
train_size = int((samples_count / BATCH_SIZE) * TEST_TRAIN_SPLIT)
train = dataset.take(train_size)
test = dataset.skip(train_size)
# Train model
history = model.fit(
    train,
    validation_data=test,
    epochs=2
)
Inside dataset -> <BatchDataset shapes: ((16, 128), (16, 128), (16, 5)), types: (tf.float64, tf.float64, tf.float64)>
Inside train -> <TakeDataset shapes: ((16, 128), (16, 128), (16, 5)), types: (tf.float64, tf.float64, tf.float64)>
Data example:
Any help appreciated. I'm new to transformers, so please feel free to point out any extra considerations.
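One extra consideration, separate from the ValueError: the printed dataset shows labels of shape (16, 5), while the output layer is Dense(1, activation='softmax'). Softmax normalizes across the units of the layer, so with a single unit it outputs 1.0 for every sample regardless of the input. A small NumPy sketch of that behavior follows (the softmax helper and the example logits are illustrative, not part of the model above):

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# One output unit: each row has a single logit, so it normalizes to 1.0.
one_unit = softmax(np.array([[-3.0], [0.0], [7.5]]))

# Five output units: each row is a distribution over five classes.
five_units = softmax(np.array([[1.0, 2.0, 0.5, -1.0, 0.0]]))

print(one_unit.ravel())
print(five_units.shape, five_units.sum())
```

Assuming the labels are one-hot vectors over 5 classes (which the (16, 5) shape suggests but the post does not confirm), Dense(5, activation='softmax') would line up with CategoricalCrossentropy.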
I managed to fix this, as far as I can tell, with the help of @Djinn. I removed the Dataset API and instead built the training and validation sets manually using the following code:
# Split into training and validation sets
train_size = int(samples_count * TEST_TRAIN_SPLIT)
train = [ids[:train_size], mask[:train_size]]
train_labels = labels[:train_size]
test = [ids[train_size:], mask[train_size:]], labels[train_size:]
# Train model
history = model.fit(
    train, train_labels,
    validation_data=test,
    epochs=10
)
This seems to be working and fit() accepted this data, but feel free to point out if this is wrong or could be done differently.
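On doing it differently: the tf.data version probably failed for a simple reason. Dataset.map returns a new dataset and does not modify the original in place, so the result of the bare dataset.map(map_dataset) line was discarded. That matches the printed structure, a plain 3-tuple of tensors with no dict, and Keras reads a 3-tuple as (inputs, targets, sample_weights), so the model only saw one input tensor. Here is a minimal sketch with dummy arrays (the shapes, dtypes, and batch size are placeholders, not the real data):

```python
import numpy as np
import tensorflow as tf

# Dummy stand-ins for the tokenized ids, attention mask, and labels.
ids = np.zeros((4, 8), dtype=np.int32)
mask = np.ones((4, 8), dtype=np.int32)
labels = np.zeros((4, 5), dtype=np.float32)

def map_dataset(ids, masks, labels):
    # Keys match the names given to the Input layers.
    return {'input_ids': ids, 'input_mask': masks}, labels

dataset = tf.data.Dataset.from_tensor_slices((ids, mask, labels))

# Calling map without keeping the result leaves `dataset` unchanged.
dataset.map(map_dataset)
discarded_spec = dataset.element_spec  # still a 3-tuple of tensors

# Assigning the result gives elements of the form ({inputs}, labels).
mapped = dataset.map(map_dataset)
mapped = mapped.shuffle(10).batch(2, drop_remainder=True)
mapped_spec = mapped.element_spec

print(discarded_spec)
print(mapped_spec)
```

With the assignment in place, each element is a ({'input_ids': ..., 'input_mask': ...}, labels) pair, which is the structure fit() expects for a two-input model.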