
fashion_mnist Data ML Accuracy Score is only 0.1


I am pretty new to ML and am trying to do a typical fashion_mnist classification. The problem is that the accuracy score after I run the code is only 0.1 and the loss is below 0, so I guess the model is not learning, but I don't know what the problem is. Thanks!

from tensorflow.keras.datasets import fashion_mnist 
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

x_train = x_train.astype('float32')
print(type(x_train))
x_train = x_train.reshape(60000,784)
x_train = x_train / 255.0
x_test = x_test.reshape(60000,784)
x_test = x_test / 255.0


from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model= Sequential()
model.add(Dense(100, activation="sigmoid", input_shape=(784,)))
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer='sgd', loss="binary_crossentropy", metrics=["accuracy"])

model.fit(
    x_train,
    y_train,
    epochs=10,
    batch_size=1000)

Output:

[screenshot of the training log: accuracy stays around 0.1 and the loss goes negative]


Solution

  • There are multiple issues with your code:

    1. There is an error in your reshape of the test set: it should be x_test = x_test.reshape(10000,784), since the test set has only 10000 images.
    2. You are using a sigmoid activation in the first dense layer, which is not a good practice. Instead, use relu.
    3. Your output Dense layer has only 1 node, but you are working with a dataset that has 10 unique classes, so the output layer has to be Dense(10). Note that even though y_train contains the integer labels 0-9, a neural network with a softmax or sigmoid activation can't predict integer values directly; what you are really trying to do is predict a probability for EACH of the 10 classes.
    4. You are using the incorrect activation in the final layer for multi-class classification. Use softmax.
    5. You are using the incorrect loss function. For multi-class classification use categorical_crossentropy. Since your output is a 10-dimensional probability distribution but your y_train holds a single integer label per sample, you can use sparse_categorical_crossentropy instead, which computes the same thing but accepts label-encoded y (see the sketch right after this list).
    6. Try using a better optimizer, such as adam, to avoid getting stuck in local minima.
    7. CNNs are generally preferred for image data, since a plain Dense layer cannot capture the spatial features that make up an image. But since the images are small (28,28) and this is a toy example, it's OK the way it is.
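
    To make point 5 concrete, here is a minimal sketch (assuming TensorFlow 2.x) showing that the two losses agree when each gets the label encoding it expects:

    import numpy as np
    import tensorflow as tf
    
    y_label = np.array([3])                            #integer-encoded class label
    y_onehot = tf.one_hot(y_label, depth=10)           #its one-hot equivalent
    y_pred = tf.nn.softmax(tf.random.normal((1, 10)))  #dummy 10-class probabilities
    
    sparse = tf.keras.losses.sparse_categorical_crossentropy(y_label, y_pred)
    dense = tf.keras.losses.categorical_crossentropy(y_onehot, y_pred)
    print(sparse.numpy()[0], dense.numpy()[0])         #identical values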

    Please refer to this table to check what to use. You have to make sure you know what problem you are solving in the first place, though.

    Problem type                        Last-layer activation    Loss function
    ----------------------------------  -----------------------  -------------------------
    Binary classification               sigmoid                  binary_crossentropy
    Multi-class, single-label           softmax                  categorical_crossentropy
    Multi-class, multi-label            sigmoid                  binary_crossentropy
    Regression to arbitrary values      none                     mse

    In your case, you want to do multi-class single-label classification, but by using the incorrect loss and output-layer activation you are instead setting up a multi-class multi-label problem. The sketch below shows why that combination also drives your loss negative.
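
    A quick sketch (again assuming TensorFlow 2.x): binary_crossentropy treats y_true as a probability in [0,1], so feeding it a raw class label like 7 flips the sign of the (1-y)*log(1-p) term and the loss can go below zero. It likely also explains the 0.1 accuracy: the rounded sigmoid output is always 0 or 1, so it can only ever match the roughly 10% of samples labeled 0 or the 10% labeled 1.

    import tensorflow as tf
    
    y_true = tf.constant([[7.0]])  #class label misused as a binary target
    y_pred = tf.constant([[0.9]])  #a single sigmoid output
    print(tf.keras.losses.binary_crossentropy(y_true, y_pred).numpy()[0])  #~ -13.08

    With that cleared up, here is your code with the fixes applied: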

    from tensorflow.keras.datasets import fashion_mnist 
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import Dense
    
    #Load data
    (x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
    
    #Normalize
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    
    #Reshape
    x_train = x_train.reshape(60000,784)
    x_train = x_train / 255.0
    x_test = x_test.reshape(10000,784)
    x_test = x_test / 255.0
    
    print('Data shapes->',[i.shape for i in [x_train, y_train, x_test, y_test]])
    
    #Construct computation graph
    model = Sequential()
    model.add(Dense(100, activation="relu", input_shape=(784,)))
    model.add(Dense(10, activation="softmax"))
    
    #Compile with loss as cross_entropy and optimizer as adam
    model.compile(optimizer='adam', loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    
    #Fit model
    model.fit(x_train, y_train, epochs=10, batch_size=1000)
    
    Data shapes-> [(60000, 784), (60000,), (10000, 784), (10000,)]
    Epoch 1/10
    60/60 [==============================] - 0s 5ms/step - loss: 0.8832 - accuracy: 0.7118
    Epoch 2/10
    60/60 [==============================] - 0s 6ms/step - loss: 0.5125 - accuracy: 0.8281
    Epoch 3/10
    60/60 [==============================] - 0s 6ms/step - loss: 0.4585 - accuracy: 0.8425
    Epoch 4/10
    60/60 [==============================] - 0s 6ms/step - loss: 0.4238 - accuracy: 0.8547
    Epoch 5/10
    60/60 [==============================] - 0s 7ms/step - loss: 0.4038 - accuracy: 0.8608
    Epoch 6/10
    60/60 [==============================] - 0s 6ms/step - loss: 0.3886 - accuracy: 0.8656
    Epoch 7/10
    60/60 [==============================] - 0s 6ms/step - loss: 0.3788 - accuracy: 0.8689
    Epoch 8/10
    60/60 [==============================] - 0s 6ms/step - loss: 0.3669 - accuracy: 0.8725
    Epoch 9/10
    60/60 [==============================] - 0s 6ms/step - loss: 0.3560 - accuracy: 0.8753
    Epoch 10/10
    60/60 [==============================] - 0s 6ms/step - loss: 0.3451 - accuracy: 0.8794
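
    Once training looks healthy, it's worth checking generalization on the held-out test split prepared above (a short sketch using the model just trained):

    import numpy as np
    
    test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"test accuracy: {test_acc:.4f}")
    
    #The model outputs a 10-way probability distribution; argmax recovers the label
    probs = model.predict(x_test[:1])
    print("predicted:", np.argmax(probs, axis=-1)[0], "| true:", y_test[0])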
    

    I am also adding code for your reference with convolutional layers, using categorical_crossentropy and the functional API instead of Sequential. Please read the inline comments in the code for more clarity. This should help you get an idea of some good practices when working with Keras.

    from tensorflow.keras.datasets import fashion_mnist 
    from tensorflow.keras import layers, Model, utils
    
    #Load data
    (x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
    
    #Normalize
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    
    #Reshape
    x_train = x_train.reshape(60000,28,28,1)
    x_train = x_train / 255.0
    x_test = x_test.reshape(10000,28,28,1)
    x_test = x_test / 255.0
    
    #Set y to onehot instead of label encoded
    y_train = utils.to_categorical(y_train)
    y_test = utils.to_categorical(y_test)
    
    #print([i.shape for i in [x_train, y_train, x_test, y_test]])
    
    #Construct computation graph
    inp = layers.Input((28,28,1))
    x = layers.Conv2D(32, (3,3), activation='relu', padding='same')(inp)
    x = layers.MaxPooling2D((2,2))(x)
    x = layers.Conv2D(32, (3,3), activation='relu', padding='same')(x)
    x = layers.MaxPooling2D((2,2))(x)
    x = layers.Flatten()(x)
    out = layers.Dense(10, activation='softmax')(x)
    
    #Define model
    model = Model(inp, out)
    
    #Compile with loss as cross_entropy and optimizer as adam
    model.compile(optimizer='adam', loss="categorical_crossentropy", metrics=["accuracy"])
    
    #Fit model
    model.fit(x_train, y_train, epochs=10, batch_size=1000)
    
    utils.plot_model(model, show_layer_names=False, show_shapes=True)
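
    Since y_test is now one-hot encoded, recover class labels with argmax when inspecting predictions (a sketch under the same assumptions as above):

    import numpy as np
    
    test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"CNN test accuracy: {test_acc:.4f}")
    
    #y_test was one-hot encoded above, so undo it with argmax for comparison
    pred_labels = np.argmax(model.predict(x_test[:5]), axis=-1)
    true_labels = np.argmax(y_test[:5], axis=-1)
    print("predicted:", pred_labels, "| true:", true_labels)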
    

    [image: plot_model diagram of the CNN architecture with layer shapes]