python-3.x tensorflow machine-learning keras

Tensorflow/Keras model not fitting data


I'm working on a project with Tensorflow and Keras, but I'm running into an error with the shape of my data. When I run the model.fit() portion of the code, I get the error below. I know I need to reshape the data, but I'm not sure exactly where.

Here is the head of my dataframe:

House   Q1  Q2  Q3  Q4  Q5  Q6  Q7
0   slytherin   ask for more stories    black visions goblet    cold    ghosts  worried about mental health call dr forest  left
1   slytherin   ask for more stories    black visions goblet    hunger  superstrength   nightmare silly voice   moon    black
2   slytherin   ask for more stories    fresh parchment being ignored   every area of magic volunteer to fight  moon    tails
3   slytherin   ask for more stories    golden sunspots potion  boredom merpeople   silly voice dusk    tails
4   slytherin   ask for more stories    golden sunspots potion  feared  vampires    draw wand and stand ground  dawn    white

df.shape (1209, 8)

from sklearn.model_selection import train_test_split
from tensorflow import keras
from tensorflow.keras import layers

Input:
X = df[['Q1', 'Q2', 'Q3', 'Q4', 'Q5', 'Q6', 'Q7']]
y = df['House']

# splitting data: 80% train, 20% test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
print('Y_train shape:', y_train.shape)
print('Y_test shape:', y_test.shape)

Output:
X_train shape: (967, 7)
X_test shape: (242, 7)
Y_train shape: (967,)
Y_test shape: (242,)

Input:
model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=[7]),
    layers.Dense(32, activation='relu'),
    layers.Dense(16, activation='relu'),
    layers.Dense(4, activation='softmax')
])
model.summary()

Output:
Model: "sequential_15"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense_57 (Dense)            (None, 64)                512       
                                                                 
 dense_58 (Dense)            (None, 32)                2080      
                                                                 
 dense_59 (Dense)            (None, 16)                528       
                                                                 
 dense_60 (Dense)            (None, 4)                 68        
                                                                 
=================================================================
Total params: 3,188
Trainable params: 3,188
Non-trainable params: 0
_________________________________________________________________

Input:
model.compile(
    optimizer='adam', 
    loss='categorical_crossentropy', 
    metrics=['accuracy'])

model.fit(
    X_train, 
    y_train, 
    epochs=50, 
    validation_data=(X_test, y_test))

Output:
  File "/usr/local/lib/python3.9/site-packages/keras/backend.py", line 5559, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

    ValueError: Shapes (None, 1) and (None, 4) are incompatible

I went back and changed my output layer to 1 unit and got the error below.

SyntaxWarning: In loss categorical_crossentropy, expected y_pred.shape to be (batch_size, num_classes) with num_classes > 1. Received: y_pred.shape=(None, 1). Consider using 'binary_crossentropy' if you only have 2 classes.
  return dispatch_target(*args, **kwargs)
2023-04-29 15:53:30.502022: W tensorflow/core/framework/op_kernel.cc:1807] OP_REQUIRES failed at cast_op.cc:121 : UNIMPLEMENTED: Cast string to float is not supported

I'm a little confused here as well, since I should have 4 classes, not 2. If anyone has any direction or can clarify what I'm not able to see, I would greatly appreciate it. Thanks in advance!


Solution

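  • First of all, your data is text, and Dense layers need numbers, so it has to be encoded before training. Fit a tokenizer on the training text and convert it into integer sequences with tokenizer.texts_to_sequences(x_train), and do the same for x_test. Then pad the sequences to a fixed length with pad_sequences(x_train, padding='post', maxlen=100), and likewise for x_test. You can then give the padded x_train and x_test to your model. The House labels are strings too, so they also need to be mapped to integer class ids (or one-hot vectors) before calling fit. Regarding the number of units in the output layer, it depends on the number of unique values in the House column, so it would be model.add(Dense(len(set(df['House'].values)), activation='softmax')), which gives 4 units in your case.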
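
To make the encoding steps above concrete, here is a minimal sketch in plain Python of what they do to the data. It is not the Keras implementation: the helper names (build_vocab, texts_to_sequences, pad_post, encode_labels, one_hot) are made up for illustration, word ids are assigned in order of first appearance, and it assumes the seven answers of each row have been joined into one string. The second house label is hypothetical, since the question only shows slytherin rows.

```python
# Illustrative only: plain-Python stand-ins for the tokenizing, padding and
# label-encoding steps. These helpers are hypothetical, not Keras functions.

def build_vocab(texts):
    """Give every distinct word an integer id; 0 is reserved for padding."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab) + 1  # ids start at 1
    return vocab

def texts_to_sequences(texts, vocab):
    """Replace each known word with its id; unknown words are dropped."""
    return [[vocab[w] for w in t.lower().split() if w in vocab] for t in texts]

def pad_post(sequences, maxlen):
    """Keep the first maxlen ids and right-pad shorter sequences with 0."""
    return [seq[:maxlen] + [0] * (maxlen - len(seq)) for seq in sequences]

def encode_labels(labels):
    """Map string labels to integer class ids (classes sorted by name)."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[label] for label in labels], classes

def one_hot(ids, num_classes):
    """Turn integer class ids into one-hot rows of length num_classes."""
    return [[1.0 if i == j else 0.0 for j in range(num_classes)] for i in ids]

# Two rows from the question, with Q1..Q7 joined into a single string (assumption).
train_rows = [
    "ask for more stories golden sunspots potion boredom merpeople silly voice dusk tails",
    "ask for more stories golden sunspots potion feared vampires draw wand and stand ground dawn white",
]
train_houses = ["slytherin", "gryffindor"]  # second label is hypothetical

MAXLEN = 16  # fixed input length; the model's input_shape must match it

vocab = build_vocab(train_rows)  # build the vocabulary from training text only
padded = pad_post(texts_to_sequences(train_rows, vocab), MAXLEN)

label_ids, classes = encode_labels(train_houses)
one_hot_labels = one_hot(label_ids, len(classes))

print(padded)
print(label_ids, one_hot_labels)
```

Every row now has the same length and every label is numeric. With integer class ids and the 4-unit softmax layer, loss='sparse_categorical_crossentropy' accepts the labels directly, while 'categorical_crossentropy' expects the one-hot form. That is why the original run reported shapes (None, 1) and (None, 4) as incompatible. The first layer's input_shape also has to match the padded length rather than 7.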