Tags: python-3.x, keras, generator, tensorflow2.0

Is it possible to use a custom generator to train a multi-input architecture with Keras and TensorFlow 2.0.0?


With TF 2.0.0, I can train an architecture with one input, I can train an architecture with one input using a custom generator, and I can train an architecture with two inputs. But I can't train an architecture with two inputs using a custom generator.

To keep it minimalist, here's a simple example, with no generator and no multiple inputs to start with:

from tensorflow.keras import layers, models, Model, Input, losses
from numpy import random, array, zeros

input1 = Input(shape=(2,))
dense1 = layers.Dense(5)(input1)
fullModel = Model(inputs=input1, outputs=dense1)
fullModel.summary()

# Generate random examples:
nbSamples = 21
X_train = random.rand(nbSamples, 2)
Y_train = random.rand(nbSamples, 5)

batchSize = 4
fullModel.compile(loss=losses.LogCosh())
fullModel.fit(X_train, Y_train, epochs=10, batch_size=batchSize)

It's a single dense layer that takes input vectors of size 2. The randomly generated dataset contains 21 examples and the batch size is 4. Instead of loading all the data and passing it to model.fit(), we can also pass a custom generator. The main advantage (for RAM consumption) is that only one batch is loaded at a time, rather than the whole dataset. Here is a simple example with the previous architecture and a custom generator:

import json
# Save the last dataset in a file:
with open("./dataset1input.txt", 'w') as file:
    for i in range(nbSamples):
        example = {"x": X_train[i].tolist(), "y": Y_train[i].tolist()}
        file.write(json.dumps(example) + "\n")

def generator1input(datasetPath, batch_size, inputSize, outputSize):
    # Preallocated batch arrays, refilled in place on each pass over the file
    X_batch = zeros((batch_size, inputSize))
    Y_batch = zeros((batch_size, outputSize))
    i = 0
    while True:  # loop forever; fit() stops after steps_per_epoch batches
        with open(datasetPath, 'r') as file:
            for line in file:
                example = json.loads(line)
                X_batch[i] = array(example["x"])
                Y_batch[i] = array(example["y"])
                i += 1
                if i % batch_size == 0:
                    yield (X_batch, Y_batch)
                    i = 0

fullModel.compile(loss=losses.LogCosh())
my_generator = generator1input("./dataset1input.txt", batchSize, 2, 5)
fullModel.fit(my_generator, epochs=10, steps_per_epoch=nbSamples // batchSize)

Here, the generator opens the dataset file but loads only batch_size examples (not nbSamples examples) each time it yields, advancing through the file as it loops.

Now, I can build a simple functional architecture with 2 inputs, and no generator:

input1 = Input(shape=(2,))
dense1 = layers.Dense(5)(input1)
subModel1 = Model(inputs=input1, outputs=dense1)
input2 = Input(shape=(3,))
dense2 = layers.Dense(5)(input2)
subModel2 = Model(inputs=input2, outputs=dense2)
averageLayer = layers.average([subModel1.output, subModel2.output])
fullModel = Model(inputs=[input1, input2], outputs=averageLayer)
fullModel.summary()

# Generate random examples:
nbSamples = 21
X1 = random.rand(nbSamples, 2)
X2 = random.rand(nbSamples, 3)
Y = random.rand(nbSamples, 5)

fullModel.compile(loss=losses.LogCosh())
fullModel.fit([X1, X2], Y, epochs=10, batch_size=batchSize)

Up to this point, all models compile and run. But I can't use a generator with this last architecture and its 2 inputs. I tried the following code (which, in my opinion, should logically work):

# Save data in a file:
with open("./dataset.txt", 'w') as file:
    for i in range(nbSamples):
        example = {"x1": X1[i].tolist(), "x2": X2[i].tolist(), "y": Y[i].tolist()}
        file.write(json.dumps(example) + "\n")

def generator(datasetPath, batch_size, inputSize1, inputSize2, outputSize):
    # One preallocated array per model input, plus one for the targets
    X1_batch = zeros((batch_size, inputSize1))
    X2_batch = zeros((batch_size, inputSize2))
    Y_batch = zeros((batch_size, outputSize))
    i = 0
    while True:
        with open(datasetPath, 'r') as file:
            for line in file:
                example = json.loads(line)
                X1_batch[i] = array(example["x1"])
                X2_batch[i] = array(example["x2"])
                Y_batch[i] = array(example["y"])
                i += 1
                if i % batch_size == 0:
                    yield ([X1_batch, X2_batch], Y_batch)
                    i = 0

fullModel.compile(loss=losses.LogCosh())
my_generator = generator("./dataset.txt", batchSize, 2, 3, 5)
fullModel.fit(my_generator, epochs=10, steps_per_epoch=(nbSamples//batchSize))

I obtain the following error:

File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 729, in fit
    use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 224, in fit
    distribution_strategy=strategy)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 547, in _process_training_inputs
    use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 606, in _process_inputs
    use_multiprocessing=use_multiprocessing)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\keras\engine\data_adapter.py", line 566, in __init__
    reassemble, nested_dtypes, output_shapes=nested_shape)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\ops\dataset_ops.py", line 540, in from_generator
    output_types, tensor_shape.as_shape, output_shapes)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\util\nest.py", line 471, in map_structure_up_to
    results = [func(*tensors) for tensors in zip(*all_flattened_up_to)]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\data\util\nest.py", line 471, in <listcomp>
    results = [func(*tensors) for tensors in zip(*all_flattened_up_to)]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 1216, in as_shape
    return TensorShape(shape)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in __init__
    self._dims = [as_dimension(d) for d in dims_iter]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 776, in <listcomp>
    self._dims = [as_dimension(d) for d in dims_iter]
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 718, in as_dimension
    return Dimension(value)
File "C:\Anaconda\lib\site-packages\tensorflow_core\python\framework\tensor_shape.py", line 193, in __init__
    self._value = int(value)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'tuple'

As explained in the docs, the x argument of model.fit() can be "A generator or keras.utils.Sequence returning (inputs, targets)", and "The iterator should return a tuple of length 1, 2, or 3, where the optional second and third elements will be used for y and sample_weight respectively". Thus, I think it cannot take more than one generator as input. Perhaps multiple inputs are simply not possible with a custom generator. Would you have an explanation? A solution?

(Otherwise, it seems possible to go through tf.data.Dataset.from_generator() with a less custom approach, but I have trouble understanding what to put in the output_signature argument.)
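For reference, here is a minimal sketch of that route, assuming TF 2.4+ (where from_generator() accepts output_signature; on TF 2.0 the equivalent arguments are output_types and output_shapes) and a generator that yields batches as ((X1_batch, X2_batch), Y_batch):

import tensorflow as tf

# One TensorSpec per yielded array; None keeps the batch dimension flexible.
# NumPy arrays default to float64, hence the dtype.
dataset = tf.data.Dataset.from_generator(
    lambda: generator("./dataset.txt", batchSize, 2, 3, 5),
    output_signature=(
        (tf.TensorSpec(shape=(None, 2), dtype=tf.float64),
         tf.TensorSpec(shape=(None, 3), dtype=tf.float64)),
        tf.TensorSpec(shape=(None, 5), dtype=tf.float64),
    ),
)
fullModel.fit(dataset, epochs=10, steps_per_epoch=nbSamples // batchSize)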



[EDIT] Thank you for your response @Francis Tang. In fact, it is possible to use a custom generator: your answer made me realize that I just had to change the line:

yield ([X1_batch, X2_batch], Y_batch)

To:

yield (X1_batch, X2_batch), Y_batch

Nevertheless, it may indeed be better to use tf.keras.utils.Sequence, though I find it a bit restrictive. In particular, in the example given (as in most examples I could find about Sequence), __init__() first loads the full dataset, which defeats the purpose of a generator. But maybe that was just one particular example, and __init__() need not be used that way: you could read the file and load only the desired batch directly in __getitem__(). In that case, though, you seem forced either to scan through the data file on every call, or to create one file per batch beforehand (not really optimal).
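A possible middle ground, as a sketch (the JsonLinesSequence class and its offset index are my own illustration, not from the thread): record each line's byte offset once in __init__(), then seek straight to the requested batch in __getitem__(), so neither the whole dataset nor the whole file is read per batch:

import json
from numpy import array
from tensorflow.keras.utils import Sequence

class JsonLinesSequence(Sequence):
    # Hypothetical helper: reads one batch at a time from a JSON-lines file
    def __init__(self, path, batch_size):
        self.path = path
        self.bs = batch_size
        self.offsets = []  # byte offset of every line; no example data kept in RAM
        offset = 0
        with open(path, 'rb') as f:
            for line in f:
                self.offsets.append(offset)
                offset += len(line)

    def __len__(self):
        # Number of batches per epoch (ceiling division)
        return (len(self.offsets) - 1) // self.bs + 1

    def __getitem__(self, idx):
        X1, X2, Y = [], [], []
        with open(self.path, 'rb') as f:
            for off in self.offsets[idx * self.bs:(idx + 1) * self.bs]:
                f.seek(off)  # jump directly to the line, no scanning
                ex = json.loads(f.readline())
                X1.append(ex["x1"])
                X2.append(ex["x2"])
                Y.append(ex["y"])
        return (array(X1), array(X2)), array(Y)

fullModel.fit(JsonLinesSequence("./dataset.txt", batchSize), epochs=10)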


Solution

You need to write a class using Sequence: https://www.tensorflow.org/api_docs/python/tf/keras/utils/Sequence

import pickle
from tensorflow.keras.utils import Sequence

class generator(Sequence):
    def __init__(self, filename, batch_size):
        # Load the pickled dict of arrays once, up front
        data = pickle.load(open(filename, 'rb'))
        self.X1 = data['X1']
        self.X2 = data['X2']
        self.y = data['y']
        self.bs = batch_size

    def __len__(self):
        # Number of batches per epoch (ceiling division)
        return (len(self.y) - 1) // self.bs + 1

    def __getitem__(self, idx):
        # Slice one batch out of the in-memory arrays
        start, end = idx * self.bs, (idx + 1) * self.bs
        return (self.X1[start:end], self.X2[start:end]), self.y[start:end]
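A usage sketch, assuming the data was pickled as a dict with keys 'X1', 'X2' and 'y' (the filename is illustrative):

# Save the arrays from the question in the format the class expects
import pickle
with open("./dataset.pkl", 'wb') as f:
    pickle.dump({'X1': X1, 'X2': X2, 'y': Y}, f)

seq = generator("./dataset.pkl", batchSize)
fullModel.fit(seq, epochs=10)  # no steps_per_epoch needed: __len__ provides it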