python tensorflow keras conv-neural-network mnist

Difficulty with stacking MNIST and Fashion_MNIST


I know it's basic and too easy for you people, but I'm a beginner who needs your help. I'm struggling to make a binary classifier with a CNN. My final goal is to reach an accuracy over 0.99.

I import both MNIST and FASHION_MNIST to identify whether an image is a number or clothing, so there are 2 categories. I want to categorize 0-60000 as 0, and 60001-120000 as 1. I will use binary_crossentropy.

But I don't know how to get started. How can I use vstack or hstack to combine MNIST and FASHION_MNIST in the first place?

This is what I have tried so far:

import numpy as np
from keras.datasets import mnist
from keras.datasets import fashion_mnist
import keras
import tensorflow as tf
from keras.utils.np_utils import to_categorical
num_classes = 2
train_images = train_images.astype("float32") / 255
test_images = test_images.astype("float32") / 255
train_images = train_images.reshape((-1, 784))
test_images = test_images.reshape((-1, 784))
train_labels = to_categorical(train_labels, num_classes)
test_labels = to_categorical(test_labels, num_classes)

Solution

  • First of all

    They're images, so it's better to treat them as images and not reshape them into vectors.
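
    A minimal sketch of what keeping them as images could look like, assuming a convolutional model that expects a trailing channel axis. The array below is a random stand-in rather than real MNIST data, and prepare_images is a hypothetical helper name, not something from the question:

```python
import numpy as np

def prepare_images(images):
    """Scale pixels to [0, 1] and add a channel axis (hypothetical helper).

    The 28x28 layout is kept; nothing is flattened to 784.
    """
    images = images.astype("float32") / 255
    return images[..., np.newaxis]  # (N, 28, 28) -> (N, 28, 28, 1)

# Random stand-in for a small batch of grayscale images.
fake_images = np.random.randint(0, 256, size=(5, 28, 28), dtype=np.uint8)
prepared = prepare_images(fake_images)
print(prepared.shape)  # (5, 28, 28, 1)
```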

    Now the answer to the question. Suppose you have mnist_train_image and fashion_train_image, both with shape (60000, 28, 28).

    What you want to do consists of 2 parts: combining the inputs and making the targets.

    First the inputs

    As you've already written in the question, you can use np.vstack like this:

    >>> train_image = np.vstack((fashion_train_image, mnist_train_image))
    >>> train_image.shape
    (120000, 28, 28)
    

    But as you may have already noticed, remembering whether you need vstack or dstack or hstack is kind of a pain. My preference is to use np.concatenate instead:

    >>> train_image = np.concatenate((fashion_train_image, mnist_train_image), axis=0)
    >>> train_image.shape
    (120000, 28, 28)
    

    Now, instead of remembering what the duck v or h or d stand for, you just need to remember the axis (or dimension) along which you want to concatenate; in this case it's the first axis, which means 0. This helps especially in a case like this one, where the "vertical" direction is the second axis, because it's a stack of images and the first axis is the "batch".
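
    To make the axis point concrete, here is a small sketch comparing the result shapes of the stacking functions on 3-D arrays. The arrays a and b are tiny made-up stand-ins (3 images each), not the real datasets:

```python
import numpy as np

# Tiny stand-ins for the two datasets: 3 "images" of 28x28 each.
a = np.zeros((3, 28, 28), dtype=np.uint8)
b = np.ones((3, 28, 28), dtype=np.uint8)

shapes = {
    "vstack": np.vstack((a, b)).shape,  # joins along axis 0 -> (6, 28, 28)
    "hstack": np.hstack((a, b)).shape,  # joins along axis 1 -> (3, 56, 28)
    "dstack": np.dstack((a, b)).shape,  # joins along axis 2 -> (3, 28, 56)
    "concatenate axis=0": np.concatenate((a, b), axis=0).shape,
}
for name, shape in shapes.items():
    print(name, shape)
```

    Only the axis-0 variants give more images of the same size; the other two glue the pictures together side by side.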

    Next, the labels

    Since you want to categorize 0-60000 as 0 and 60001-120000 as 1, there are a lot of fancy ways to do this.

    But in a nutshell, you can use np.zeros to create an array filled with 0, and np.ones to, you guessed it, create an array filled with 1. Both ones and zeros give you an array of floats by default, and I'm not sure whether this will become a problem, so I add .astype('uint8') at the end just in case. You can pass the parameter dtype='uint8' to the function instead, too.
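
    A quick sketch of the dtype behaviour described here, using short arrays of length 4 purely for illustration:

```python
import numpy as np

default_zeros = np.zeros(4)                 # no dtype given
casted_zeros = np.zeros(4).astype("uint8")  # convert afterwards
typed_zeros = np.zeros(4, dtype="uint8")    # ask for uint8 up front

print(default_zeros.dtype, casted_zeros.dtype, typed_zeros.dtype)
# float64 uint8 uint8
```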

    Either use the concatenate from above:

    >>> train_labels = np.concatenate((np.zeros(60000), np.ones(60000))).astype('uint8')
    >>> train_labels.shape
    (120000,)
    

    Or use ones or zeros for the whole size and then subtract, add, or reassign the rest:

    >>> train_labels = np.zeros(120000).astype('uint8')
    >>> train_labels[60000:] = 1
    >>> # or, the other way around:
    >>> train_labels = np.ones(120000, dtype='uint8')
    >>> train_labels[:60000] -= 1
    

    Important!!!!

    There's a noticeable mistake in the label ranges in your question: indexing starts at 0, so the 60,000th element has index 59,999.

    So what you actually want is to categorize 0-59999 as 0, and 60000-119999 as 1.
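
    The boundary can be checked directly, and the two parts can be put together in one small function. This is only a sketch under a few assumptions: combine_and_shuffle is a hypothetical helper name, the input arrays are tiny stand-ins rather than the real datasets, and the shuffling step is my own addition, on the assumption that you don't want all of one class followed by all of the other when training:

```python
import numpy as np

# Boundary check on the full-size label array.
full_labels = np.zeros(120000, dtype="uint8")
full_labels[60000:] = 1
print(full_labels[59999], full_labels[60000])  # 0 1

def combine_and_shuffle(first, second, seed=0):
    """Stack two image arrays, label them 0 and 1, shuffle them together.

    Hypothetical helper: `first` gets label 0, `second` gets label 1.
    """
    images = np.concatenate((first, second), axis=0)
    labels = np.zeros(len(images), dtype="uint8")
    labels[len(first):] = 1  # everything after `first` belongs to `second`

    # One shared permutation keeps every image paired with its label.
    order = np.random.default_rng(seed).permutation(len(images))
    return images[order], labels[order]

# Tiny stand-ins: 4 all-zero "images" and 4 all-one "images".
first = np.zeros((4, 28, 28), dtype=np.uint8)
second = np.ones((4, 28, 28), dtype=np.uint8)
images, labels = combine_and_shuffle(first, second)
print(images.shape, labels.shape)  # (8, 28, 28) (8,)
```

    With the real data, first and second would be the training images returned by the load_data() functions of the two Keras dataset modules imported in the question.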