I have a model set up with one input and two outputs, and I am trying to train it with either a Python generator or a tf.data dataset.
So far all my attempts have run into errors, which I can only assume stem from the shapes/types of the output of the generators I've tried setting up. What format should the output of such a generator be? I am also very open to suggestions for doing this differently. You can download my whole notebook here if you'd like to look through it.
The input to the model is of shape
(None,)
And is of type
tf.string
I am able to get model output with
model(tf.constant(['Hello TensorFlow!']))
There are two output heads for the model, the first is of shape
(None, 128, 5)
The second is of shape
(None, 128, 3)
They both are of type
tf.float32
The loss for my model is sparse categorical crossentropy. (I want a softmax across 5 or 3 classes, depending on the head, for each of the 128 outputs, with the None being the batch size.) I believed the proper output format for this would be a tuple of batch_size instances of the following format
(input_string, (output_for_head1, output_for_head2))
where input_string is a string, and output_for_head1 and output_for_head2 are both numpy arrays of shape (128,) and integer dtype.
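For reference, the per-sample layout Keras expects for a multi-output model is (x, (y1, y2)), with the leading batch dimension added by batching rather than by nesting tuples. A minimal sketch of that structure (all data here is made up for illustration):

```python
import numpy as np
import tensorflow as tf

# Toy data: 4 samples, each a string input plus two integer label
# sequences of length 128 (values are placeholders).
texts = tf.constant(['a', 'b', 'c', 'd'])
y1 = np.zeros((4, 128), dtype=np.int64)  # labels for head 1 (5 classes)
y2 = np.zeros((4, 128), dtype=np.int64)  # labels for head 2 (3 classes)

# Per-sample structure is (x, (y1, y2)); .batch() adds the None dimension.
ds = tf.data.Dataset.from_tensor_slices((texts, (y1, y2))).batch(2)

x, (b1, b2) = next(iter(ds))
print(x.shape, b1.shape, b2.shape)  # (2,) (2, 128) (2, 128)
```

A dataset built this way can be passed straight to model.fit, since each element already has the (x, y) structure Keras looks for.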
One attempt gets an index out of bounds error; I'm pretty sure the data needs to be batched.
Another attempt gets the error
Data is expected to be in format `x`, `(x,)`, `(x, y)`, or `(x, y, sample_weight)`, found: ((<tf.Tensor: shape=(), dtype=string, numpy=b'Ya Yeet'>, (<tf.Tensor: shape=(128,), dtype=int64, numpy=... ( a very long set of (128,) tensors which is too large to post here)
[[{{node PyFunc}}]]
[[IteratorGetNext]] [Op:__inference_train_function_95064]
Function call stack:
train_function
I figured out the solution using generators. I first created a generator yielding numpy arrays that the model could be trained on directly, and then created a tf.data dataset from a slightly modified version of that generator.
The solution was to yield just three numpy arrays per batch, like
input_arr, (output_arr1, output_arr2)
where the shape of each array is expanded to have the batch size on the leading axis, rather than yielding a tuple of length batch_size.
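Concretely, the difference is a batch-major array versus a tuple of per-sample arrays; np.stack turns a list of (128,) arrays into a single (batch, 128) array:

```python
import numpy as np

# A batch of 10 label sequences, each of length 128 (placeholder values).
samples = [np.zeros(128, dtype=np.int64) for _ in range(10)]

# Wrong shape for model.fit: a length-10 tuple of (128,) arrays.
as_tuple = tuple(samples)

# Right shape: one array with the batch size on the leading axis.
batched = np.stack(samples)
print(batched.shape)  # (10, 128)
```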
The final generators looked like this:
def text_data_generator(dataset_path, batch_size, input_text_col='text',
                        output_classes_col='labels', classes=CLASSES,
                        continuity_classes=CONTINUITY_CLASSES,
                        pad_length=128, sep=' '):
    while True:
        for chunk in pd.read_csv(dataset_path, chunksize=batch_size):
            # TODO: should probably shuffle the dataset somehow
            texts = chunk['text'].values
            c_classes = np.stack(chunk['classes'].apply(
                lambda x: pad([classes.index(item) for item in x.split(sep)])).values)
            c_continuity = np.stack(chunk['continuity'].apply(
                lambda x: pad([continuity_classes.index(item) for item in x.split(sep)])).values)
            texts = np.array(texts)
            c_classes = np.array(c_classes)
            c_continuity = np.array(c_continuity)
            yield texts, (c_classes, c_continuity)
and
def tf_text_data_generator(dataset_path, batch_size, input_text_col='text',
                           output_classes_col='labels', classes=CLASSES,
                           continuity_classes=CONTINUITY_CLASSES,
                           pad_length=128, sep=' '):
    for chunk in pd.read_csv(dataset_path, chunksize=batch_size):
        texts = chunk['text'].values
        c_classes = np.stack(chunk['classes'].apply(
            lambda x: pad([classes.index(item) for item in x.split(sep)])).values)
        c_continuity = np.stack(chunk['continuity'].apply(
            lambda x: pad([continuity_classes.index(item) for item in x.split(sep)])).values)
        texts = np.array(texts)
        c_classes = np.array(c_classes)
        c_continuity = np.array(c_continuity)
        yield texts, (c_classes, c_continuity)
The model could be trained directly on an instance of text_data_generator. To train on the other generator I created a tf.data.Dataset by
def wrapped_gen():
    return tf_text_data_generator("test.csv", 10)

dataset = tf.data.Dataset.from_generator(wrapped_gen, (tf.string, (tf.int64, tf.int64)))
which can then be passed directly to model.fit, just as the instantiated generator could be.