I am training a model with Keras on GBs of data, to the point where my computer can't handle the RAM needed. So I am trying to implement my training so that one epoch is done with multiple model.fit calls, with something like:
for epoch in range(nbEpoches):
    for index_df in range(len(list_of_dataFrames)):
        dataFrame = load_dataFrame(list_of_dataFrames, index_df)  # load only this DF in RAM
        X_train, Y_train, X_test, Y_test = calc_train_arrays(dataFrame)
        model.fit(
            X_train, Y_train,
            validation_data=(X_test, Y_test),
            # ... what I am asking
            batch_size=batch_size,
        )
X_train and X_test are numpy arrays of shape (many thousands, 35 to 200, 54+), so using multiple batches is mandatory (for the GPU's VRAM), and dynamically loading the dataFrames is too (for the RAM); this is what forces me to use multiple fit calls for the same epoch.
I am asking how to use the model.fit function in order to do it.
I also wondered if using a generator that yields arrays of shape (batch_size, 35+, 54+) and specifying steps_per_epoch could be an idea, something like the sketch below?
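For example, here is a rough sketch of what I have in mind (it reuses my load_dataFrame and calc_train_arrays helpers and ignores the validation data for now):

def batch_generator(list_of_dataFrames, batch_size):
    # loop forever: Keras draws steps_per_epoch batches per epoch from this generator
    while True:
        for index_df in range(len(list_of_dataFrames)):
            dataFrame = load_dataFrame(list_of_dataFrames, index_df)  # only this DF in RAM
            X_train, Y_train, _, _ = calc_train_arrays(dataFrame)
            for start in range(0, len(X_train), batch_size):
                yield (X_train[start:start + batch_size],
                       Y_train[start:start + batch_size])

# steps_per_epoch would then be the total number of batches over all dataFrames:
# model.fit(batch_generator(list_of_dataFrames, batch_size),
#           steps_per_epoch=total_batches, epochs=nbEpoches)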
I first tried to avoid the problem by training on a single dataFrame of around 20k samples, but the model has generalisation issues. I also tried to do one epoch per dataFrame, but it seemed like each dataFrame was making the model forget the others.
I guess you have 2 options.
You can try a custom data generator. Here is a tutorial (I think this may be a little difficult): https://medium.com/analytics-vidhya/write-your-own-custom-data-generator-for-tensorflow-keras-1252b64e41c3
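A rough sketch of what such a generator could look like for your case, based on keras.utils.Sequence. It assumes your load_dataFrame and calc_train_arrays helpers from the question, and that every dataFrame holds the same number of batches (batches_per_df), which you would need to adapt:

from tensorflow import keras

class DataFrameSequence(keras.utils.Sequence):
    """Serves batches while keeping only one dataFrame in RAM at a time."""

    def __init__(self, list_of_dataFrames, batch_size, batches_per_df):
        self.list_of_dataFrames = list_of_dataFrames
        self.batch_size = batch_size
        self.batches_per_df = batches_per_df  # assumed constant across dataFrames
        self._cached_index = None
        self._X = self._Y = None

    def __len__(self):
        # total number of batches in one epoch
        return len(self.list_of_dataFrames) * self.batches_per_df

    def __getitem__(self, idx):
        index_df, batch_in_df = divmod(idx, self.batches_per_df)
        if index_df != self._cached_index:
            # only reload when we move on to a new dataFrame
            dataFrame = load_dataFrame(self.list_of_dataFrames, index_df)
            self._X, self._Y, _, _ = calc_train_arrays(dataFrame)
            self._cached_index = index_df
        start = batch_in_df * self.batch_size
        return (self._X[start:start + self.batch_size],
                self._Y[start:start + self.batch_size])

# model.fit(DataFrameSequence(list_of_dataFrames, batch_size, batches_per_df),
#           epochs=nbEpoches)

This way a single fit call covers all dataFrames, so one epoch really sees all the data, which should help avoid the "forgetting" between dataFrames you describe.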
You can also define a custom training loop. Here is a tutorial: https://www.tensorflow.org/guide/keras/writing_a_training_loop_from_scratch
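A bare-bones sketch of such a loop applied to your setup (the optimizer and loss below are placeholders, pick whatever fits your task; load_dataFrame and calc_train_arrays are the helpers from your question):

import tensorflow as tf

optimizer = tf.keras.optimizers.Adam()        # placeholder optimizer
loss_fn = tf.keras.losses.MeanSquaredError()  # placeholder loss

for epoch in range(nbEpoches):
    for index_df in range(len(list_of_dataFrames)):
        dataFrame = load_dataFrame(list_of_dataFrames, index_df)  # only this DF in RAM
        X_train, Y_train, X_test, Y_test = calc_train_arrays(dataFrame)
        dataset = tf.data.Dataset.from_tensor_slices((X_train, Y_train)).batch(batch_size)
        for x_batch, y_batch in dataset:
            with tf.GradientTape() as tape:
                predictions = model(x_batch, training=True)
                loss = loss_fn(y_batch, predictions)
            grads = tape.gradient(loss, model.trainable_weights)
            optimizer.apply_gradients(zip(grads, model.trainable_weights))

The model's weights persist across dataFrames here, so one pass over all the dataFrames counts as a single epoch.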
I am not sure if this is what you want.