Tags: python, numpy, tensorflow, keras

Simple TensorFlow/Keras model with one-hot vector output gives error message


I am trying to build a simple multi-layer perceptron classifier to begin with, before elaborating on the model, but I am having trouble matching the output layer to the labelled data. The most concise code that shows the behavior is:

import numpy as np
import tensorflow as tf

# Generate example dataset
X = np.random.rand(1000, 1)  # One-dimensional input
y = np.random.randint(0, 10, size=(1000,))  # Random integer labels
X_val = np.random.rand(1000, 1)  # One-dimensional input
y_val = np.random.randint(0, 10, size=(1000,))  # Random integer labels

# Convert integer labels to one-hot vectors
y_one_hot = tf.one_hot(y, depth=10)  # Assuming 10 classes
y_val_one_hot = tf.one_hot(y_val, depth=10)

# Create a zipped dataset
train_dataset = tf.data.Dataset.from_tensor_slices((X, y_one_hot))
val_dataset = tf.data.Dataset.from_tensor_slices((X_val, y_val_one_hot))

# Define batch size
batch_size = 32

# Shuffle and batch the training dataset
train_dataset = train_dataset.shuffle(buffer_size=100).batch(batch_size)

# Define the MLP model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(1,)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')  # Output layer with softmax activation for one-hot vectors
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=[tf.keras.metrics.Precision()])

# Train the model
model.fit(train_dataset, epochs=10, validation_data=val_dataset)

The error message given is:

File "/home/vboxuser/.local/lib/python3.11/site-packages/keras/src/backend.py", line 5573, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

    ValueError: Shapes (10, 1) and (1, 10) are incompatible

I have tried using tf.transpose and tf.reshape, to no avail.


Solution

  • Note that the validation_data must be one of the following:

    • A tuple (x_val, y_val) of NumPy arrays or tensors.
    • A tuple (x_val, y_val, val_sample_weights) of NumPy arrays.
    • A tf.data.Dataset.
    • A Python generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample_weights).

    A tf.data.Dataset is in fact one of the accepted options, but, like the training dataset, it must be batched. Here val_dataset was created with tf.data.Dataset.from_tensor_slices and never batched, so Keras receives one sample at a time and the target and output shapes no longer line up, which is what produces the "Shapes (10, 1) and (1, 10) are incompatible" error. (Batching the validation dataset also works; see the sketch after the full example below.)

    The simplest fix is to pass a plain tuple of arrays, which Keras batches automatically:

      val_dataset = (X_val, y_val_one_hot)
    

    and test the complete program:

    # Generate example dataset
    X = np.random.rand(1000, 1)  # One-dimensional input
    y = (X.ravel()*1000//100).astype(int)  # integer labels in 0..9 (tf.one_hot needs integer indices)
    X_val = np.random.rand(1000, 1)  # One-dimensional input
    y_val = (X_val.ravel()*1000//100).astype(int)  # integer labels in 0..9
    
    # Convert integer labels to one-hot vectors
    y_one_hot = tf.one_hot(y, depth=10)  # Assuming 10 classes
    y_val_one_hot = tf.one_hot(y_val, depth=10)
    
    # Create a zipped dataset
    train_dataset = tf.data.Dataset.from_tensor_slices((X, y_one_hot))
    val_dataset = (X_val, y_val_one_hot) # THE CHANGE
    
    # Define batch size
    batch_size = 32
    
    # Shuffle and batch the training dataset
    train_dataset = train_dataset.shuffle(buffer_size=100).batch(batch_size)
    
    # Define the MLP model
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(1,)),
        tf.keras.layers.Dense(32, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')  # Output layer with softmax activation for one-hot vectors
    ])
    
    # Compile the model
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=[tf.keras.metrics.Precision()])
    
    # Train the model
    model.fit(train_dataset, epochs=20, validation_data=val_dataset)
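
    An equivalent fix, not shown above but consistent with the options listed earlier, is to keep the tf.data.Dataset for validation and simply batch it the same way as the training data; a minimal sketch:

      # Alternative: keep the Dataset but batch it, so every element Keras sees
      # is a (batch, features) input paired with a (batch, classes) target.
      val_dataset = tf.data.Dataset.from_tensor_slices((X_val, y_val_one_hot)).batch(batch_size)

      model.fit(train_dataset, epochs=20, validation_data=val_dataset)

    Either way, each batch reaching the loss has shape (batch_size, 10) for both the one-hot targets and the model output, so categorical_crossentropy no longer reports incompatible shapes.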