I am working on a keras denoising neural network that denoise high Dimension x-ray images. The idea is to train on some datasets eg.1,2,3 and after having the weights, another datasets eg.4,5,6 will start with a new training with weights initialized from the previous training. Implementation-wise it works, however the weights resulted from the last rotation perform better only on the datasets that were used to train on in this rotation. Same goes for other rotation.
In other words, weights resutlted from training on dataset: 4,5,6 doesn't give the good results on an image of dataset 1 as intended as the weights that were trained on datasets: 1,2,3. which shouldn't be what I intend to do
The idea is that weights should be tweaked to work with all datasets effectively, as training on the whole dataset doesn't fit into memory.
I tried other solutions such as creating custom generator that takes images from disk and do the training as batches which is very slow as it depends on factors like I/O operations happening on disk or the time complexity of processing functions happening inside the custom keras generator!
Below is a code that shows what I am doing. I have 12 datasets, seperated into 4 checkpoints. data is loaded and training goes and saves final model to an array and next training takes the weights from the previous rotation and continues.
EPOCHES = 150
NUM_CHKPTS = 4
weights = []
for chk in range(1,NUM_CHKPTS+1):
log_dir = os.path.join(os.getcwd(), 'resnet_checkpts_' + str(EPOCHES) + "_tl2_chkpt" + str(chk))
if not os.path.isdir(log_dir):
os.makedirs(log_dir)
else:
print('Training log directory already exists @ {}.'.format(log_dir))
tb_output = TensorBoard(log_dir=log_dir, histogram_freq=1)
print("Loading Data From CHKPT #" + str(chk))
h5f = h5py.File('C:\\autoencoder\\datasets\\mix\\chk' + str(chk) + '.h5','r')
org_patch = h5f['train_data'][:]
noisy_patch = h5f['train_noisy'][:]
h5f.close()
input_patch, test_patch, noisy_patch, test_noisy_patch = train_test_split(org_patch, noisy_patch, train_size=0.8, shuffle=True)
print("Reshaping")
train_data = np.array([np.reshape(input_patch[i], (52, 52, 1)) for i in range(input_patch.shape[0])], dtype = np.float32)
train_noisy_data = np.array([np.reshape(noisy_patch[i], (52, 52, 1)) for i in range(noisy_patch.shape[0])], dtype = np.float32)
test_data = np.array([np.reshape(test_patch[i], (52, 52, 1)) for i in range(test_patch.shape[0])], dtype = np.float32)
test_noisy_data = np.array([np.reshape(test_noisy_patch[i], (52, 52, 1)) for i in range(test_noisy_patch.shape[0])], dtype = np.float32)
print('Number of training samples are:', train_data.shape[0])
print('Number of test samples are:', test_data.shape[0])
# IN = np.ones((len(XTRAINFILES), 52, 52, 1 ))
if chk == 1:
print("Generating the Model For The First Time..")
autoencoder_model = model_autoencoder(train_noisy_data)
print("Done!")
else:
autoencoder_model=load_model(weights[chk-2])
checkpt_path = log_dir + r"\\cp-{epoch:04d}.ckpt"
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpt_path, verbose=0, save_weights_only=True, save_freq='epoch')
optimizer = tf.keras.optimizers.Adam(lr=0.0001)
autoencoder_model.compile(loss='mse',optimizer=optimizer)
autoencoder_model.fit(train_noisy_data, train_data,
batch_size=128,
epochs=EPOCHES, shuffle=True, verbose=1,
validation_data=(test_noisy_data, test_data),
callbacks=[tb_output, checkpoint_callback])
weight_dir = log_dir+'\\model_resnet_new_OL' + str(EPOCHES) + 'epochs.h5'
weights.append(weight_dir)
autoencoder_model.save(weight_dir) # Defined saved model name by number of epochs.
Your model will forget previous dataset as you train on new dataset.
I read in reinforcement learning, when game are used to train Deep Reinforcement Learning (DRL), then you have to create memory replay, which collect data from different rounds of game, because each round of game has different data, then randomly some of that data is chosen to train model. that way DRL model can learn to play different rounds of game without forgetting previous rounds.
You can try to create a single dataset by taking some random samples from each dataset.
When you train model on new dataset that make sure data from all previous rotation are in current rotation.
Also in transfer learning, when you train model on new dataset, you have to freeze previous layers so that model don`t forget previous training. you are not using transfer learning but still when you start training on 2nd dataset your 1st dataset will slowly be removed from memory of weights.
you can try freezing initial layers of decoder so that they are not updated when extracting feature, assuming all of the dataset contain similar images, that way your model will not forget previous training as in transfer learning. but still when you train on new dataset previous will be forgotten.