python, keras, mnist, image-classification

How to load the MNIST dataset in Keras using Python?


I'm a beginner learning Keras with Python.

I've read some sample code that loads the MNIST dataset.

I don't understand the variables (X_train, y_train) and (X_test, y_test).

Please help me by explaining the purpose of these variables.

Also, what type of data is assigned to these variables?

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical  # replaces np_utils, which newer Keras releases no longer provide

# Load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

Solution

  • The MNIST dataset contains 70,000 sample images of handwritten digits. Each image comes with a label giving the digit (0 to 9) shown in it, and each image is 28x28 pixels. The images are split into two sets: 60,000 training images and 10,000 test images. You use the training images to train your model, then measure accuracy and loss on the test images, which the neural network has not seen before, to check how well it generalizes.

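    (X_train, y_train) is a tuple: two values grouped together as one element. mnist.load_data() returns two such tuples, one for the training set and one for the test set, and the assignment unpacks them into four variables.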

    The images are stored as NumPy arrays. X_train has shape (60000, 28, 28): 60,000 images of 28x28 pixels, or 784 (28*28) values each once flattened. Each cell holds one pixel's grayscale value as an unsigned 8-bit integer, from 0 (background) to 255 (full stroke intensity).

    X_test holds the 10,000 test images in the same format, with shape (10000, 28, 28). The labels matching the images are stored in y_train and y_test, one integer from 0 to 9 per image.
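
To make the layout concrete, here is a minimal sketch of what the four variables look like and how they are often prepared for a Dense-only model. So that it runs without downloading anything, it uses a hypothetical stand-in, fake_load_data, that returns random arrays in the same nested-tuple layout and dtype as mnist.load_data(), but with far fewer samples. The helper name and the sample counts are assumptions for illustration only. With the real dataset you would keep the mnist.load_data() call from the question.

```python
import numpy as np


def fake_load_data(n_train=600, n_test=100, seed=0):
    """Hypothetical stand-in for mnist.load_data().

    Returns ((images, labels), (images, labels)) with random content,
    using the same per-image shape (28, 28) and dtype (uint8) as MNIST.
    """
    rng = np.random.default_rng(seed)

    def make(n):
        images = rng.integers(0, 256, size=(n, 28, 28), dtype=np.uint8)
        labels = rng.integers(0, 10, size=n, dtype=np.uint8)
        return images, labels

    return make(n_train), make(n_test)


# Same unpacking pattern as in the question: two tuples become four variables.
(X_train, y_train), (X_test, y_test) = fake_load_data()

print(X_train.shape, X_train.dtype)  # images: (samples, 28, 28), uint8
print(y_train.shape, y_train.dtype)  # labels: (samples,), uint8

# Flatten each 28x28 image into a 784-long vector and scale pixels to [0, 1],
# a common preparation step before feeding the data to Dense layers.
X_train_flat = X_train.reshape(len(X_train), 28 * 28).astype("float32") / 255.0
X_test_flat = X_test.reshape(len(X_test), 28 * 28).astype("float32") / 255.0

print(X_train_flat.shape)  # (samples, 784)
```

Printing the shapes right after loading is a quick way to check which layout you have before building the model, since a Dense input layer expects the flattened 784-value form rather than 28x28 grids.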