python artificial-intelligence tensorflow2.0

Tensorflow creating images with AI


I'm a beginner in data science and TensorFlow, so, as a test of my "skills", I wanted to try to make an AI that takes a number and gives back a 28x28 pixel image of that number. It is possible to do this the other way around, so I figured, why not? The code runs without errors, but the accuracy of the AI is very low, so low in fact that it just returns random pixels. Is there any way to make this AI more accurate, apart from maybe training for something like 100 epochs? Here's the code I'm using:

import tensorflow as tf
import tensorflow.keras as tk

import numpy as np
import matplotlib.pyplot as plt

(train_data, train_labels), (test_data, test_labels) = tk.datasets.mnist.load_data()

model = tk.Sequential([
    tk.layers.Dense(64, activation='relu'),
    tk.layers.Dense(64, activation='relu'),
    tk.layers.Dense(784, activation='relu'),
])

history = model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics=['acc'])

train_data = np.reshape(train_data, (60000, 784))
test_data = np.reshape(test_data, (-1, 784))

model.fit(train_labels, train_data, epochs=10, validation_data=(test_labels, test_data))

result = model.predict([2])

result = np.reshape(result, (28, 28))

plt.imshow(result)

plt.show()

I'm using Google Colab since I haven't yet been able to install TensorFlow on my computer; maybe it has something to do with that. Thanks in advance for any answers!


Solution

  • This is very much possible, and has resulted in a vast area of research called Generative Adversarial Networks (GANs).

    First off, let me list the problems with your approach:

    1. You use a single number as input and expect the model to understand it. This does not work well in practice, because the model treats the digit as a magnitude rather than as a category. It's better to use a representation called one-hot encoding.

    2. For each label, many images exist. Mathematically, if the domain X consists of labels (one-hot encoded or not) and the range Y consists of images, the relationship is one-to-many, which cannot be learned in a plain supervised fashion. By its very nature, supervised learning models only many-to-one or one-to-one relationships (although there is no point in using ML for the latter; a dictionary is a much better choice). Trained on a one-to-many mapping, a model can emit only one output per label, so it tends toward a blurry average of all the images for that label. Therefore, it is easy to predict labels for images, but not to generate realistic images for labels with a fully supervised approach, unless you train on only one image per label.
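To make points 1 and 2 concrete, here is a minimal NumPy sketch. The `one_hot` helper and the tiny 2x2 "images" are made up for illustration and are not part of the question's code; in TensorFlow you would normally use a built-in one-hot utility instead.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Turn integer labels into one-hot rows, e.g. 2 -> [0, 0, 1, 0, ...]."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes, dtype=np.float32)[labels]

# Point 1: feed the network a category vector instead of a raw number.
encoded = one_hot([2, 7])
print(encoded.shape)  # (2, 10)

# Point 2: three different (flattened, hypothetical) 2x2 images share one label.
images_for_label = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
])

# A deterministic model can emit only one output per label. Under a
# squared-error loss, the single output with the lowest average error
# is the pixel-wise mean of the targets, which matches none of them.
best_single_output = images_for_label.mean(axis=0)
print(best_single_output)
```

The mean image is the "blur" you get when one input has to stand in for many different targets.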

    The way GANs solve the problem in (2) is by generating a "probable" image from a set of random values. Variations such as conditional GANs also let you specify which digit to generate.
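As a rough sketch of the data flow only, this is how a label-conditioned generator might combine random values with a one-hot label. It is not a GAN: the weights are random and untrained, there is no discriminator, and `NOISE_DIM`, the layer sizes, and the `generate` function are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

NOISE_DIM = 16          # hypothetical size of the random input vector
NUM_CLASSES = 10        # digits 0-9
IMAGE_PIXELS = 28 * 28

# Random, untrained weights stand in for a real generator network.
w1 = rng.normal(scale=0.1, size=(NOISE_DIM + NUM_CLASSES, 64))
w2 = rng.normal(scale=0.1, size=(64, IMAGE_PIXELS))

def generate(label, noise):
    """Map (label, noise) to a 28x28 array with values in (0, 1)."""
    label_vec = np.eye(NUM_CLASSES)[label]
    x = np.concatenate([noise, label_vec])           # condition the input on the label
    hidden = np.maximum(0.0, x @ w1)                 # ReLU hidden layer
    pixels = 1.0 / (1.0 + np.exp(-(hidden @ w2)))    # sigmoid keeps pixels in (0, 1)
    return pixels.reshape(28, 28)

# Same label, different noise: two different outputs for the digit 2.
img_a = generate(2, rng.normal(size=NOISE_DIM))
img_b = generate(2, rng.normal(size=NOISE_DIM))
print(img_a.shape, np.allclose(img_a, img_b))
```

Because the noise changes on every call, one label can map to many images, which is exactly the one-to-many relationship a plain supervised model cannot represent. Adversarial training is what would make those outputs look like real digits.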

    I suggest reading the paper that introduced GANs ("Generative Adversarial Networks", Goodfellow et al., 2014). Then try out a basic GAN before moving on to generating specific digits.