
TensorFlow 2 error: Can't convert Python sequence with mixed types to Tensor


To learn image classification, I want to load my data with the tf.data.Dataset class.

The input_X value is a list of file paths.
ex. ['/path/to/image1.jpg', '/path/to/image2.jpg', ...]

The input_Y value is a list of label names that match input_X.
ex. ['cat', 'cat', 'dog', ...]

I want to load the data using the tf.data.Dataset.from_generator function together with my own generator function.

Below is my generator code:

def get_generator(self, dataset, full_path, category_names, input_shape):
    images = dataset[0]
    labels = dataset[1]
    ih, iw = input_shape[0:2]

    images = self.attach_fullpath(images, full_path)
    for data in zip(images, labels):
        imagename, labelname = data

        image = cv2.imread(imagename)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image = tf.image.resize(image, [ih, iw])
        image = tf.cast(image, tf.float32)
        image = image / 255.0

        label = tf.one_hot(category_names.index(labelname), depth=len(category_names))
        label = tf.cast(label, tf.float32)

        yield (image, label)

If I do next(generator), I get the following output:

# X data
tf.Tensor(
[[[0.31978127 0.20193568 0.20587036]
  [0.33178368 0.21166316 0.20612068]
  [0.34334144 0.2161987  0.19959025]
  ...
  [0.42936707 0.2707099  0.1780539 ]
  [0.41268054 0.26256576 0.16227722]
  [0.4328849  0.28778687 0.18582608]]], shape=(299, 299, 3), dtype=float32)

# Y data
tf.Tensor([0. 0. 1.], shape=(3,), dtype=float32)

After defining this generator, I defined the dataset using the tf.data.Dataset.from_generator function as follows:

dataset = tf.data.Dataset.from_generator(self.get_generator, 
                                         (tf.float32, tf.float32),
                                         args=(train_data, 
                                               self.full_path, 
                                               ['omelette', 'pizza', 'samosa'], 
                                               (299, 299, 3)))
dataset = dataset.batch(8)

Here, full_path is the directory prefix that is prepended to each image name to turn it into an absolute path.

And, train_data[0] is a list of the paths of the images mentioned above,
and train_data[1] is a list of label names that match the paths of the images mentioned above.

However, I got this error:

Can't convert Python sequence with mixed types to Tensor.

How can I solve this error?
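
A message like this is typically raised when TensorFlow is asked to build a single tensor from a nested Python sequence whose leaves do not all share one type. Because from_generator converts everything in args to tensors before calling the generator, one way to narrow the problem down is to look at the leaf types of each argument. The sketch below is pure Python; leaf_types is a hypothetical helper, and the sample values (in particular the integer labels) are invented for illustration, since the question does not show the real contents of train_data.

```python
def leaf_types(value):
    """Return the set of type names found at the leaves of a nested list/tuple."""
    if isinstance(value, (list, tuple)):
        found = set()
        for item in value:
            found |= leaf_types(item)
        return found
    return {type(value).__name__}


# Hypothetical stand-ins for the values passed through `args`.
train_data = (
    ["image1.jpg", "image2.jpg"],  # train_data[0]: image paths
    [0, 1],                        # train_data[1]: labels (ints assumed, for illustration only)
)
category_names = ["omelette", "pizza", "samosa"]

for name, value in [("train_data", train_data), ("category_names", category_names)]:
    print(name, sorted(leaf_types(value)))
# train_data ['int', 'str']
# category_names ['str']
```

Any argument that reports more than one leaf type is a likely candidate for the conversion failure.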


Solution

  • I took a peek at the TensorFlow docs (full disclosure: I haven't used from_generator before). I believe the generator should also be yielding tensors. You probably want to do 1-of-k encoding on the target "y" labels (e.g. "cat" is 0 and "dog" is 1), and enforce a consistent ordering so that each file path lines up with its corresponding label.
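
Building on that, one possible workaround is to avoid args altogether. from_generator evaluates args as tensors and hands them to the generator as NumPy arrays, so Python lists of strings do not arrive as plain lists. Binding the Python values in a closure sidesteps that conversion. This is a minimal sketch, assuming TensorFlow 2.4 or later (for output_signature); make_generator, build_dataset and load_image are hypothetical names, not part of the question's code, and load_image stands in for the cv2 and tf.image steps of the original generator.

```python
def make_generator(image_paths, label_names, category_names, load_image):
    """Build a zero-argument generator function over (image, one-hot label) pairs.

    `load_image` is a hypothetical callable mapping a path to a float array
    of shape (height, width, 3), already resized and scaled to [0, 1].
    """
    index_of = {name: i for i, name in enumerate(category_names)}

    def generator():
        # zip keeps each path aligned with the label at the same position.
        for path, label_name in zip(image_paths, label_names):
            one_hot = [0.0] * len(category_names)
            one_hot[index_of[label_name]] = 1.0
            yield load_image(path), one_hot

    return generator


def build_dataset(generator, input_shape, num_classes, batch_size=8):
    """Wrap the generator in a batched tf.data.Dataset without using `args`."""
    import tensorflow as tf  # imported here so the generator itself needs no TensorFlow

    signature = (
        tf.TensorSpec(shape=input_shape, dtype=tf.float32),
        tf.TensorSpec(shape=(num_classes,), dtype=tf.float32),
    )
    dataset = tf.data.Dataset.from_generator(generator, output_signature=signature)
    return dataset.batch(batch_size)
```

With the question's values this would be called roughly as build_dataset(make_generator(paths, labels, ['omelette', 'pizza', 'samosa'], load_image), (299, 299, 3), 3). Because from_generator receives a function rather than a generator object, the dataset can be iterated for more than one epoch.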