To train a model for an image classification problem, I want to load data using the tf.data.Dataset class.
The input_X value is a list with file paths.
ex. [/path/to/image1.jpg, /path/to/image2.jpg, ...]
The input_Y value is a list of label names that match input_X.
ex. ['cat', 'cat', 'dog', ...]
I want to load the data using the tf.data.Dataset.from_generator function and my own generator function.
Below is my generator code:
def get_generator(self, dataset, full_path, category_names, input_shape):
    images = dataset[0]
    labels = dataset[1]
    ih, iw = input_shape[0:2]
    images = self.attach_fullpath(images, full_path)
    for data in zip(images, labels):
        imagename, labelname = data
        image = cv2.imread(imagename)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = tf.image.resize(image, [ih, iw])
        image = tf.cast(image, tf.float32)
        image = image / 255.0
        label = tf.one_hot(category_names.index(labelname), depth=len(category_names))
        label = tf.cast(label, tf.float32)
        yield (image, label)
If I call next(generator), I get the following output:
# X data
tf.Tensor(
[[[0.31978127 0.20193568 0.20587036]
[0.33178368 0.21166316 0.20612068]
[0.34334144 0.2161987 0.19959025]
...
[0.42936707 0.2707099 0.1780539 ]
[0.41268054 0.26256576 0.16227722]
[0.4328849 0.28778687 0.18582608]]], shape=(299, 299, 3), dtype=float32)
# Y data
tf.Tensor([0. 0. 1.], shape=(3,), dtype=float32)
After defining this generator, I defined the dataset using the tf.data.Dataset.from_generator function as follows:
dataset = tf.data.Dataset.from_generator(self.get_generator,
                                         (tf.float32, tf.float32),
                                         args=(train_data,
                                               self.full_path,
                                               ['omelette', 'pizza', 'samosa'],
                                               (299, 299, 3)))
dataset = dataset.batch(8)
Here, full_path is the path prefix that is prepended to each image path to make it an absolute path.
And, train_data[0] is a list of the paths of the images mentioned above,
and train_data[1] is a list of label names that match the paths of the images mentioned above.
However, I got this error:
Can't convert Python sequence with mixed types to Tensor.
How can I solve this error?
I took a peek at the TensorFlow docs. Full disclosure: I haven't used from_generator before. I believe the generator should also be yielding tensors. You probably want to do 1-of-k encoding on the target "y" labels (e.g. "cat" is 0 and "dog" is 1), and just enforce some sort of ordering to match the filepaths to the corresponding images.
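
To make the encoding and ordering idea concrete, here is a minimal pure-Python sketch. It is not the asker's code and does not touch TensorFlow or OpenCV: the helper names one_hot and make_pair_generator are hypothetical, and the pairing of paths with labels in the example is made up for illustration. It only shows how a fixed category order gives every label a stable index, and how zip keeps each path aligned with its own label.

```python
def one_hot(index, depth):
    """Return a 1-of-k vector as a list of floats with 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


def make_pair_generator(image_paths, label_names, category_names):
    """Build a zero-argument generator function over (path, one-hot label) pairs.

    Hypothetical helper: the Python lists are captured by the closure, so the
    returned function needs no arguments of its own.
    """
    if len(image_paths) != len(label_names):
        raise ValueError("image_paths and label_names must have the same length")

    # Fix the class order once so every label name maps to a stable index.
    index_of = {name: i for i, name in enumerate(category_names)}

    def generator():
        # zip walks both lists in step, so each path stays with its label.
        for path, name in zip(image_paths, label_names):
            yield path, one_hot(index_of[name], len(category_names))

    return generator


# Category names taken from the question; the path/label pairing is illustrative.
categories = ['omelette', 'pizza', 'samosa']
gen = make_pair_generator(
    ['/path/to/image1.jpg', '/path/to/image2.jpg'],
    ['samosa', 'omelette'],
    categories,
)
pairs = list(gen())
```

Because the lists are bound inside the closure, a function like this takes no arguments. That may matter here: the from_generator documentation describes args as values that are evaluated as tensors and passed to the generator as NumPy arrays, so handing plain Python lists through args is one plausible place for a tensor conversion error to come from. Whether that is the cause in this case is an assumption, not something verified against the asker's data.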