I am trying to build a simple multi-layer perceptron classifier as a starting point before elaborating on the model, but I am having trouble matching the output layer to the labelled data. The most concise code that reproduces the behavior is:
import numpy as np
import tensorflow as tf
# Generate example dataset
X = np.random.rand(1000, 1) # One-dimensional input
y = np.random.randint(0, 10, size=(1000,)) # Random integer labels
X_val = np.random.rand(1000, 1) # One-dimensional input
y_val = np.random.randint(0, 10, size=(1000,)) # Random integer labels
# Convert integer labels to one-hot vectors
y_one_hot = tf.one_hot(y, depth=10) # Assuming 10 classes
y_val_one_hot = tf.one_hot(y_val, depth=10)
# Create a zipped dataset
train_dataset = tf.data.Dataset.from_tensor_slices((X, y_one_hot))
val_dataset = tf.data.Dataset.from_tensor_slices((X_val, y_val_one_hot))
# Define batch size
batch_size = 32
# Shuffle and batch the training dataset
train_dataset = train_dataset.shuffle(buffer_size=100).batch(batch_size)
# Define the MLP model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(1,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax') # Output layer with softmax activation for one-hot vectors
])
# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=[tf.keras.metrics.Precision()])
# Train the model
model.fit(train_dataset, epochs=10, validation_data=val_dataset)
The error message given is:
File "/home/vboxuser/.local/lib/python3.11/site-packages/keras/src/backend.py", line 5573, in categorical_crossentropy
target.shape.assert_is_compatible_with(output.shape)
ValueError: Shapes (10, 1) and (1, 10) are incompatible
I have tried using tf.transpose and tf.reshape, to no avail.
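Note that validation_data must be one of the following:
- A tuple (x_val, y_val) of NumPy arrays or tensors.
- A tuple (x_val, y_val, val_sample_weights) of NumPy arrays.
- A tf.data.Dataset.
- A Python generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample_weights).
A tf.data.Dataset is accepted, but Keras expects it to yield batches. In your code only train_dataset is batched; the val_dataset created from tf.data.Dataset.from_tensor_slices yields single examples with shapes (1,) and (10,), so it does not match what the model expects. That is consistent with the (10, 1) versus (1, 10) mismatch in the traceback.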
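One way to check what each dataset actually hands to the model is to look at its element_spec. The snippet below is a minimal sketch, assuming TensorFlow 2.x; the names unbatched and batched are only illustrative and do not appear in the question.

```python
import numpy as np
import tensorflow as tf

# Same kind of validation data as in the question
X_val = np.random.rand(1000, 1)
y_val = np.random.randint(0, 10, size=(1000,))
y_val_one_hot = tf.one_hot(y_val, depth=10)

# Illustrative names: the dataset as built in the question, and a batched copy
unbatched = tf.data.Dataset.from_tensor_slices((X_val, y_val_one_hot))
batched = unbatched.batch(32)

# element_spec mirrors the (inputs, targets) tuple structure
unbatched_x_spec, unbatched_y_spec = unbatched.element_spec
batched_x_spec, batched_y_spec = batched.element_spec

print(unbatched_x_spec.shape, unbatched_y_spec.shape)  # (1,) (10,)
print(batched_x_spec.shape, batched_y_spec.shape)      # (None, 1) (None, 10)
```

The unbatched elements have no leading batch dimension, whereas the batched ones do, which is the form the model is built for.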
The simplest fix is to pass a plain tuple and let Keras handle the batching:
val_dataset = (X_val, y_val_one_hot)
and run your program. Here is a complete test, this time with labels that depend on the input:
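import numpy as np
import tensorflow as tf
# Generate example dataset
X = np.random.rand(1000, 1) # One-dimensional input
y = (X.ravel()*1000//100).astype(np.int64) # Integer labels 0-9 derived from X; tf.one_hot needs an integer dtype
X_val = np.random.rand(1000, 1) # One-dimensional input
y_val = (X_val.ravel()*1000//100).astype(np.int64) # Integer labels 0-9 derived from X_val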
# Convert integer labels to one-hot vectors
y_one_hot = tf.one_hot(y, depth=10) # Assuming 10 classes
y_val_one_hot = tf.one_hot(y_val, depth=10)
# Create a zipped dataset
train_dataset = tf.data.Dataset.from_tensor_slices((X, y_one_hot))
val_dataset = (X_val, y_val_one_hot) # THE CHANGE
# Define batch size
batch_size = 32
# Shuffle and batch the training dataset
train_dataset = train_dataset.shuffle(buffer_size=100).batch(batch_size)
# Define the MLP model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(1,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax') # Output layer with softmax activation for one-hot vectors
])
# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=[tf.keras.metrics.Precision()])
# Train the model
model.fit(train_dataset, epochs=20, validation_data=val_dataset)
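If you would rather keep the validation data as a tf.data.Dataset, batching it the same way as the training set should also work. This is a minimal sketch, assuming TensorFlow 2.x; the smaller model, the accuracy metric, and the single epoch are simplifications chosen to keep the example short and are not part of the original code.

```python
import numpy as np
import tensorflow as tf

# Labels 0-9 derived from the input, cast to an integer dtype for tf.one_hot
X = np.random.rand(1000, 1)
y = (X.ravel() * 10).astype(np.int64)
X_val = np.random.rand(1000, 1)
y_val = (X_val.ravel() * 10).astype(np.int64)

batch_size = 32
train_dataset = (
    tf.data.Dataset.from_tensor_slices((X, tf.one_hot(y, depth=10)))
    .shuffle(buffer_size=100)
    .batch(batch_size)
)
# The validation dataset is batched too, so it yields (batch, 1) and (batch, 10)
val_dataset = (
    tf.data.Dataset.from_tensor_slices((X_val, tf.one_hot(y_val, depth=10)))
    .batch(batch_size)
)

# Reduced model for illustration only
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(train_dataset, epochs=1,
                    validation_data=val_dataset, verbose=0)
print(sorted(history.history.keys()))
```

Shuffling is left out of the validation pipeline on purpose, since the order of the validation examples does not affect the reported metrics.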