Simple TensorFlow/Keras model with one-hot vector output gives error message

I am trying to build a simple multi-layer perceptron classifier as a starting point before elaborating on the model, but I am having trouble matching the output layer to the labelled data. The most concise code that shows the behavior is:

import numpy as np
import tensorflow as tf

# Generate example dataset
X = np.random.rand(1000, 1)  # One-dimensional input
y = np.random.randint(0, 10, size=(1000,))  # Random integer labels
X_val = np.random.rand(1000, 1)  # One-dimensional input
y_val = np.random.randint(0, 10, size=(1000,))  # Random integer labels

# Convert integer labels to one-hot vectors
y_one_hot = tf.one_hot(y, depth=10)  # Assuming 10 classes
y_val_one_hot = tf.one_hot(y_val, depth=10)

# Create a zipped dataset
train_dataset = tf.data.Dataset.from_tensor_slices((X, y_one_hot))
val_dataset = tf.data.Dataset.from_tensor_slices((X_val, y_val_one_hot))

# Define batch size
batch_size = 32

# Shuffle and batch the training dataset
train_dataset = train_dataset.shuffle(buffer_size=100).batch(batch_size)

# Define the MLP model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(1,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')  # Output layer with softmax activation for one-hot vectors
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=[tf.keras.metrics.Precision()])

# Train the model
model.fit(train_dataset, epochs=10, validation_data=val_dataset)

The error message given is:

File "/home/vboxuser/.local/lib/python3.11/site-packages/keras/src/backend.py", line 5573, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

    ValueError: Shapes (10, 1) and (1, 10) are incompatible

I have tried using tf.transpose and tf.reshape, to no avail.

Solution

Note that validation_data must be one of the following:

A tuple (x_val, y_val) of NumPy arrays or tensors.
A tuple (x_val, y_val, val_sample_weights) of NumPy arrays.
A tf.data.Dataset.
A Python generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample_weights).

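val_dataset is a tf.data.Dataset, so its type is accepted, but unlike train_dataset it is never batched. Keras does not batch a Dataset for you, so each validation element is a single sample with input shape (1,) and target shape (10,). The model then produces an output of shape (1, 10) while the target is treated as shape (10, 1), which matches the shapes reported in the error.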
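One way to see what Keras receives from a dataset is to print its element_spec. The sketch below rebuilds the validation data from the question with random values and compares the dataset before and after calling .batch(); the names unbatched and batched are only illustrative.

```python
import numpy as np
import tensorflow as tf

# Random validation data with the same shapes as in the question.
X_val = np.random.rand(1000, 1)
y_val = np.random.randint(0, 10, size=(1000,))
y_val_one_hot = tf.one_hot(y_val, depth=10)

# Illustrative names: the same dataset without and with batching.
unbatched = tf.data.Dataset.from_tensor_slices((X_val, y_val_one_hot))
batched = unbatched.batch(32)

# Each element of the unbatched dataset is one sample with no batch axis.
x_spec, y_spec = unbatched.element_spec
print(x_spec.shape, y_spec.shape)

# After .batch(), every element has a leading batch axis of unknown size.
xb_spec, yb_spec = batched.element_spec
print(xb_spec.shape, yb_spec.shape)
```

The unbatched shapes lack the leading batch dimension that the model's (None, 1) input and (None, 10) output expect.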

The simplest fix is to pass a tuple instead, which Keras splits into batches itself. Thus have:

  val_dataset = (X_val, y_val_one_hot)

Here is the full program with this change. The labels are now derived from X, so the model has something to learn:

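import numpy as np
import tensorflow as tf

# Generate example dataset
X = np.random.rand(1000, 1)  # One-dimensional input
y = (X.ravel() * 1000 // 100).astype(np.int64)  # Integer labels 0-9 (tf.one_hot needs integer indices)
X_val = np.random.rand(1000, 1)  # One-dimensional input
y_val = (X_val.ravel() * 1000 // 100).astype(np.int64)  # Integer labels 0-9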

# Convert integer labels to one-hot vectors
y_one_hot = tf.one_hot(y, depth=10)  # Assuming 10 classes
y_val_one_hot = tf.one_hot(y_val, depth=10)

# Create the training dataset; validation data is passed as a plain tuple
train_dataset = tf.data.Dataset.from_tensor_slices((X, y_one_hot))
val_dataset = (X_val, y_val_one_hot)  # THE CHANGE

# Define batch size
batch_size = 32

# Shuffle and batch the training dataset
train_dataset = train_dataset.shuffle(buffer_size=100).batch(batch_size)

# Define the MLP model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(1,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')  # Output layer with softmax activation for one-hot vectors
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=[tf.keras.metrics.Precision()])

# Train the model
model.fit(train_dataset, epochs=20, validation_data=val_dataset)
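
If you would rather keep a tf.data.Dataset for validation, an alternative to the tuple is to batch it the same way as the training dataset. The following is a minimal sketch under that assumption, using random data and the model from the question; it declares the input with tf.keras.Input instead of input_shape, which is my choice and not part of the original code. It only calls model.evaluate to show that the shapes now line up.

```python
import numpy as np
import tensorflow as tf

batch_size = 32

# Random validation data with the same shapes as in the question.
X_val = np.random.rand(1000, 1)
y_val = np.random.randint(0, 10, size=(1000,))
y_val_one_hot = tf.one_hot(y_val, depth=10)

# Batch the validation dataset so each element has a leading batch axis.
val_dataset = (
    tf.data.Dataset.from_tensor_slices((X_val, y_val_one_hot))
    .batch(batch_size)
)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=[tf.keras.metrics.Precision()])

# Evaluating on the batched dataset exercises the same loss computation
# that validation inside fit() uses.
results = model.evaluate(val_dataset, verbose=0)
print(results)  # [loss, precision] for the untrained model
```

The same batched val_dataset can then be passed as validation_data to model.fit.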