I want to get the mean absolute error (MAE) for each split of the data using 5-fold cross-validation. I have built a custom model using Xception.
To try this, I coded the following:
# Data Generators:
train_gen = flow_from_dataframe(core_idg, train_df,
                                path_col = 'path',
                                y_col = 'boneage_zscore',
                                target_size = IMG_SIZE,
                                color_mode = 'rgb',
                                batch_size = 32,
                                shuffle = True)
X_train, Y_train = next(train_gen)
#-----------------------------------------------------------------------
# Custom Model initiation:
base_model = Xception(input_shape = X_train.shape[1:], include_top = False, weights = 'imagenet')
base_model.trainable = True
model = Sequential()
model.add(base_model)
model.add(GlobalMaxPooling2D())
model.add(Flatten())
model.add(Dense(16, activation = 'relu'))
model.add(Dense(1, activation = 'linear'))
def mae_months(in_gt, in_pred):
    return mean_absolute_error(boneage_div * in_gt, boneage_div * in_pred)
# Compile model
adam = Adam(learning_rate = 0.0005)
model.compile(loss = 'mse', optimizer = adam, metrics = [mae_months])
#-----------------------------------------------------------------------
# KFold
n_splits = 5
kf = KFold(n_splits = n_splits, shuffle = True, random_state = 42)
I have coded everything up to KFold, but I am stuck on how to proceed with the cross-validation step to get the MAE for each data split.
A post here suggests a for loop over the KFold splits, but does that only apply when a model such as DecisionTreeRegressor() is used, rather than a custom Xception-based model like mine?
UPDATE
Following the suggestion below, I added the following code after setting up KFold:
# Data Generators:
train_gen = flow_from_dataframe(core_idg, train_df,
                                path_col = 'path',
                                y_col = 'boneage_zscore',
                                target_size = IMG_SIZE,
                                color_mode = 'rgb',
                                batch_size = 1024,
                                shuffle = True)
...
...
...
mae_list = []
n_splits = 5
kf = KFold(n_splits = n_splits, shuffle = True, random_state = 42)
split = kf.split(X_train, Y_train) # X_train, Y_train = next(train_gen) from above
for train, test in split:
    x_train, x_test, y_train, y_test = X_train[train], X_train[test], Y_train[train], Y_train[test]
    history = model.fit(x_train, y_train, validation_data = (x_test, y_test), batch_size = 16)
    pred = model.predict(x_test, batch_size = 8)
    err = mean_absolute_error(y_test, pred)
    mae_list.append(err)
I first set the batch size of train_gen to 1024 and then ran the code above; however, I get the following error:
52/52 [==============================] - 16s 200ms/step - loss: 0.9926 - mae_months: 31.5353 - val_loss: 4.4153 - val_mae_months: 81.5463
52/52 [==============================] - 9s 172ms/step - loss: 0.4185 - mae_months: 21.4242 - val_loss: 0.7401 - val_mae_months: 29.3815
52/52 [==============================] - 9s 172ms/step - loss: 0.2930 - mae_months: 17.3729 - val_loss: 0.5628 - val_mae_months: 23.9055
9/52 [====>.........................] - ETA: 7s - loss: 0.2355 - mae_months: 16.7444
ResourceExhaustedError Traceback (most recent call last)
Input In [11], in <cell line: 9>()
10 x_train, x_test, y_train, y_test = X_train[train], X_train[test], Y_train[train], Y_train[test]
11 # model = boneage_model()
12 # history = model.fit(train_gen, validation_data = (x_test, y_test))
---> 13 history = model.fit(x_train, y_train, validation_data = (x_test, y_test), batch_size = 16)
14 pred = model.predict(x_test, batch_size = 8)
15 err = mean_absolute_error(y_test, pred)
ResourceExhaustedError: Graph execution error:
....
....
....
Node: 'gradient_tape/sequential/xception/block14_sepconv2/separable_conv2d/Conv2DBackpropFilter'
OOM when allocating tensor with shape[2048,1536,1,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node gradient_tape/sequential/xception/block14_sepconv2/separable_conv2d/Conv2DBackpropFilter}}]]
The memory allocation reported in the prompt looks like this (hopefully this makes sense):
total_region_allocated_bytes_: 5769199616
memory_limit_: 5769199616
available bytes: 0
curr_region_allocation_bytes_: 8589934592
Stats:
Limit: 5769199616
InUse: 5762760448
MaxInUse: 5769190400
NumAllocs: 192519
MaxAllocSize: 2470510592
Reserved: 0
PeakReserved: 0
LargestFreeBlock: 0
Is it because my GPU cannot take the batch_size?
UPDATE 2
I have decreased the batch_size of train_gen to 32 and removed the batch_size argument from the fit() and predict() calls. Is this the right way to determine the MAE for each data split?
Code:
# Data Generators:
train_gen = flow_from_dataframe(core_idg, train_df,
                                path_col = 'path',
                                y_col = 'boneage_zscore',
                                target_size = IMG_SIZE,
                                color_mode = 'rgb',
                                batch_size = 32,
                                shuffle = True)
X_train, Y_train = next(train_gen)
...
...
...
mae_list = []
n_splits = 5
kf = KFold(n_splits = n_splits, shuffle = True, random_state = 42)
split = kf.split(X_train, Y_train) # X_train, Y_train = next(train_gen) from above
for train, test in split:
    x_train, x_test, y_train, y_test = X_train[train], X_train[test], Y_train[train], Y_train[test]
    history = model.fit(x_train, y_train, validation_data = (x_test, y_test))
    pred = model.predict(x_test)
    err = mean_absolute_error(y_test, pred)
    mae_list.append(err)
UPDATE 3
Following the suggestions from the comments, I:
- changed the batch_size of train_gen to 64;
- added valid_gen to use X_valid and y_valid as validation data for the fit() method;
- used x_test for the predict() method.
Code:
# Checking the GPU availability
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)
...
...
...
# Data Generators:
train_gen = flow_from_dataframe(core_idg, train_df,
                                path_col = 'path',
                                y_col = 'boneage_zscore',
                                target_size = IMG_SIZE,
                                color_mode = 'rgb',
                                batch_size = 64,
                                shuffle = True)
X_train, Y_train = next(train_gen)
valid_gen = flow_from_dataframe(core_valid, valid_df,
                                path_col = 'path',
                                y_col = 'boneage_zscore',
                                target_size = IMG_SIZE,
                                color_mode = 'rgb',
                                batch_size = 64,
                                shuffle = True)
X_valid, y_valid = next(valid_gen)
# Getting MAE for each data split using 5-fold (KFold)
cv_mae = []
n_splits = 5
kf = KFold(n_splits = n_splits, shuffle = True, random_state = 42)
split = kf.split(X_train, Y_train)
for train, test in split:
    x_train, x_test, y_train, y_test = X_train[train], X_train[test], Y_train[train], Y_train[test]
    history = model.fit(x_train, y_train, validation_data = (X_valid, y_valid))
    pred = model.predict(x_test)
    err = mean_absolute_error(y_test, pred)
    cv_mae.append(err)
cv_mae
The output:
2/2 [==============================] - 8s 2s/step - loss: 3.6179 - mae_months: 66.8136 - val_loss: 2.1544 - val_mae_months: 47.2171
2/2 [==============================] - 1s 394ms/step - loss: 1.0826 - mae_months: 36.3370 - val_loss: 1.6431 - val_mae_months: 40.9770
2/2 [==============================] - 1s 344ms/step - loss: 0.6129 - mae_months: 23.0258 - val_loss: 1.8911 - val_mae_months: 45.6456
2/2 [==============================] - 1s 360ms/step - loss: 0.4500 - mae_months: 22.6450 - val_loss: 1.3592 - val_mae_months: 36.7073
2/2 [==============================] - 1s 1s/step - loss: 0.4222 - mae_months: 20.2543 - val_loss: 1.1010 - val_mae_months: 32.8488
[<tf.Tensor: shape=(13,), dtype=float32, numpy=
array([1.4442804, 1.3981661, 1.5037801, 2.2199252, 1.7645894, 1.4836203,
1.7916738, 1.3967942, 1.4069557, 2.516875 , 1.4077926, 1.4342965,
1.9279695], dtype=float32)>,
<tf.Tensor: shape=(13,), dtype=float32, numpy=
array([1.8153722, 1.9236553, 1.3917867, 1.5313213, 1.387209 , 1.3831038,
1.4519565, 1.4680854, 1.7810788, 2.5733376, 1.4269204, 1.3751 ,
1.446231 ], dtype=float32)>,
<tf.Tensor: shape=(13,), dtype=float32, numpy=
array([1.6616 , 1.6529323, 1.9181525, 2.536807 , 1.6306267, 2.856683 ,
2.113724 , 1.5543866, 1.9128528, 3.218016 , 1.4112593, 1.4043481,
3.229338 ], dtype=float32)>,
<tf.Tensor: shape=(13,), dtype=float32, numpy=
array([2.1295295, 1.8527019, 1.9779519, 3.1390932, 1.5525225, 2.0811615,
1.6279813, 1.87973 , 1.5029857, 1.6502519, 2.3677726, 1.8570358,
1.7251074], dtype=float32)>,
<tf.Tensor: shape=(12,), dtype=float32, numpy=
array([1.3926607, 1.7088655, 1.7379242, 3.5756006, 1.5988973, 1.3926607,
1.4928951, 1.4665956, 1.3926607, 1.4575896, 3.146022 , 1.3926607],
dtype=float32)>]
Does this mean that I have the MAEs for the 5 data splits (the numpy=array([...]) parts of the output)?
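A note on the shapes in that output: each fold produced a vector of 13 (or 12) values rather than a single number. That is probably because mean_absolute_error here is the Keras function (the tf.Tensor results suggest so), which averages only over the last axis, while model.predict returns shape (n, 1) and y_test has shape (n,). The subtraction then broadcasts to an (n, n) matrix, so the values are not per-sample errors either. The sketch below reproduces the effect with plain NumPy on made-up numbers and shows one way to get a single scalar per fold; the arrays and the names per_row and fold_mae are illustrative only.

```python
import numpy as np

# Made-up targets and predictions with the same shapes as in the question:
# y_test is (n,), while model.predict() returns (n, 1) for a Dense(1) head.
y_test = np.array([0.5, -1.0, 2.0], dtype=np.float32)
pred = np.array([[0.0], [0.0], [1.0]], dtype=np.float32)

# Averaging over the last axis only (what a per-sample MAE function does):
# (3, 1) - (3,) broadcasts to (3, 3), so this yields a vector, not a scalar.
per_row = np.abs(pred - y_test).mean(axis=-1)

# Flatten the predictions first, then average over all samples.
fold_mae = float(np.mean(np.abs(pred.ravel() - y_test)))

print(per_row.shape, fold_mae)
```

Because the targets are z-scores, a scalar obtained this way is in z-score units; multiplying by boneage_div, as mae_months does, would convert it to months.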
Ideally, you would take the train and test sets from the same KFold split, although it does not matter as long as you use the same seed. KFold.split() just returns indices that select the train and test elements, so you need to apply those indices to the original dataset.
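To make the point about indices concrete, here is a minimal sketch on toy NumPy arrays (the array contents and the names train_idx, test_idx and all_test_idx are illustrative, not from the question). It shows that KFold.split() yields integer index arrays and that every sample ends up in exactly one test fold.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy stand-ins for the image array and the targets (10 samples).
X = np.arange(20).reshape(10, 2)
y = np.arange(10, dtype=float)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
folds = list(kf.split(X))  # list of (train_idx, test_idx) pairs

for train_idx, test_idx in folds:
    # Index the original arrays with the returned positions.
    x_train, x_test = X[train_idx], X[test_idx]
    y_train, y_test = y[train_idx], y[test_idx]
    print(test_idx, x_train.shape, x_test.shape)

# Every sample index appears in exactly one test fold.
all_test_idx = np.sort(np.concatenate([test for _, test in folds]))
```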
Answer based on OP comment and question:
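from sklearn.model_selection import StratifiedKFold as kfold
# Note: StratifiedKFold expects class labels; for a continuous target such as
# boneage_zscore, import KFold instead.
x, y = ...  # images, labels (as NumPy arrays)
n_splits = 5
cvscores = []
kf = kfold(n_splits = n_splits, shuffle = True, random_state = 42)
split = kf.split(x, y)
for train, test in split:
    x_train, x_test, y_train, y_test = x[train], x[test], y[train], y[test]
    model = ...  # do model stuff (build and compile the model for this fold)
    _ = model.fit(x_train, y_train)
    result = model.evaluate(x_test, y_test)
    # depending on how you want to handle the results
    cvscores.append(result)
# do stuff with cvscores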
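The loop above can be sketched as a small reusable function. This is only an illustration under stated assumptions: cross_val_mae and MeanPredictor are hypothetical names, and MeanPredictor is a trivial stand-in for the Keras model so the example runs without TensorFlow. The point it tries to show is that the model is created inside the loop, so no fold starts from weights that have already seen its test samples.

```python
import numpy as np
from sklearn.model_selection import KFold


class MeanPredictor:
    """Hypothetical stand-in for a Keras model: predicts the training mean."""

    def fit(self, x, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, x):
        # Mimic the (n_samples, 1) output shape of a Dense(1) head.
        return np.full((len(x), 1), self.mean_)


def cross_val_mae(build_model, x, y, n_splits=5, seed=42):
    """Return one scalar MAE per fold, training a fresh model each time."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_mae = []
    for train_idx, test_idx in kf.split(x):
        model = build_model()  # new, untrained model for every fold
        model.fit(x[train_idx], y[train_idx])
        pred = np.asarray(model.predict(x[test_idx])).ravel()
        fold_mae.append(float(np.mean(np.abs(pred - y[test_idx]))))
    return fold_mae


x = np.zeros((10, 4))
y = np.arange(10, dtype=float)
scores = cross_val_mae(MeanPredictor, x, y)
print(scores, np.mean(scores))
```

With Keras, build_model would be a function that constructs and compiles the Xception-based network, for example something like the boneage_model() that is commented out in the traceback in the question.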
I'm not sure whether that would work with an object returned by flow_from_dataframe(), because that wouldn't be an array or array-like, although you should be able to get the arrays from it.
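One possible way to get arrays out of such a generator is to pull several batches with next() and concatenate them, instead of running the cross-validation on a single batch. The sketch below assumes the generator yields (images, targets) pairs, as next(train_gen) does in the question; collect_batches and toy_batches are hypothetical helpers, and toy_batches only imitates train_gen with random data.

```python
import numpy as np


def collect_batches(batch_iter, n_batches):
    """Stack the first n_batches (x, y) batches from an iterator into two arrays."""
    xs, ys = [], []
    for _ in range(n_batches):
        x_batch, y_batch = next(batch_iter)
        xs.append(x_batch)
        ys.append(y_batch)
    return np.concatenate(xs), np.concatenate(ys)


def toy_batches(batch_size=4):
    """Hypothetical stand-in for train_gen: yields (images, targets) forever."""
    rng = np.random.default_rng(0)
    while True:
        yield rng.random((batch_size, 8, 8, 3)), rng.random(batch_size)


X_all, Y_all = collect_batches(toy_batches(), n_batches=3)
print(X_all.shape, Y_all.shape)
```

The resulting arrays can be passed to kf.split() directly. Keep in mind that all collected images sit in host memory at once, and that a shuffling generator may start repeating samples if more batches are requested than one pass over the data contains.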