I have this data, in which I specify the batch_size as 32:
# Preparing and preprocessing the data
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_dir = '/content/pizza_steak/train'
test_dir = '/content/pizza_steak/test'

train_data_gen_aug = ImageDataGenerator(rotation_range=0.2,
                                        width_shift_range=0.2,
                                        height_shift_range=0.2,
                                        shear_range=0.2,
                                        zoom_range=0.2,
                                        horizontal_flip=True,
                                        vertical_flip=True,
                                        rescale=1./255)
test_data_gen = ImageDataGenerator(rescale=1./255)

train_data_aug = train_data_gen_aug.flow_from_directory(train_dir,
                                                        target_size=(224, 224),
                                                        class_mode='binary',
                                                        batch_size=32,
                                                        seed=42)
test_data = test_data_gen.flow_from_directory(test_dir,
                                              target_size=(224, 224),
                                              class_mode='binary',
                                              batch_size=32,
                                              seed=42)
which returns:
Found 1500 images belonging to 2 classes.
Found 500 images belonging to 2 classes.
and when I explore it as below:
# Explore the data
# (train_data is a plain, non-augmented generator defined earlier
# with the same settings; its definition is omitted here)
train_images, train_labels = train_data.next()
train_images_aug, train_labels_aug = train_data_aug.next()
test_images_aug, test_labels_aug = test_data.next()

print('train_data: ', len(train_data), train_images.shape, train_labels.shape)
print('train_data_aug: ', len(train_data_aug), train_images_aug.shape, train_labels_aug.shape)
print('test_data: ', len(test_data), test_images_aug.shape, test_labels_aug.shape)
it returns:
train_data: 47 (32, 224, 224, 3) (32,)
train_data_aug: 47 (32, 224, 224, 3) (32,)
test_data: 16 (32, 224, 224, 3) (32,)
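(For context: len() of such a generator is the number of batches per epoch, so 47 = ceil(1500 / 32) for the training set and 16 = ceil(500 / 32) for the test set; the last training batch holds only the remaining 28 images.)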
I then build and compile the model, and I specify batch_size as None in the InputLayer:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, Conv2D, MaxPool2D, Flatten, Dense
from tensorflow.keras.activations import relu, sigmoid
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import BinaryCrossentropy
from tensorflow.keras.metrics import binary_accuracy
# create the model
model = Sequential()
# Add the input layer
INPUT_SHAPE = (224, 224, 3)
model.add(InputLayer(input_shape=INPUT_SHAPE,
                     batch_size=None))  # I entered the batch size here as None
# Add the hidden layers
model.add(Conv2D(filters=10,
                 kernel_size=3,
                 strides=1,
                 padding='valid',
                 activation=relu))
model.add(MaxPool2D(pool_size=(2, 2), strides=None, padding='valid'))
# Add the flatten layer
model.add(Flatten())
# Add the output layer
model.add(Dense(units=1, activation=sigmoid))
# Compile the model
model.compile(optimizer=Adam(),
              loss=BinaryCrossentropy(),
              metrics=[binary_accuracy])
I then fit the model, again specifying batch_size as None:
history = model.fit(train_data_aug,
                    batch_size=None,  # batch_size defined as None
                    epochs=5,
                    verbose=1,
                    validation_data=test_data,
                    steps_per_epoch=len(train_data_aug),
                    validation_steps=len(test_data))
The model works fine and, when trained for only 5 epochs, reaches a val_binary_accuracy of 81 percent.
In what situations should the batch size in the other two places be used? Is it possible to define the batch_size in all of them, or could that cause a problem?
Batch size is the number of samples per gradient update. If it is left unspecified (or set to None, as in your model.fit()), it defaults to 32. However, your data comes from a generator, which already yields batches, so you should not specify a batch size in fit().
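For contrast, here is a minimal sketch (using synthetic NumPy stand-in data, not your pizza/steak images) of the case where the batch_size argument of fit() actually takes effect:

import numpy as np
# Synthetic stand-in data shaped like your images: 100 samples
x = np.random.rand(100, 224, 224, 3).astype('float32')
y = np.random.randint(0, 2, size=(100,)).astype('float32')
# With plain arrays there is no pre-batching, so fit() slices the
# data itself: ceil(100 / 16) = 7 gradient updates per epoch
model.fit(x, y, batch_size=16, epochs=1)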
From the TensorFlow docs (https://www.tensorflow.org/api_docs/python/tf/keras/Model):

Do not specify the batch_size if your data is in the form of datasets, generators, or keras.utils.Sequence instances (since they generate batches).
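As for batch_size in the InputLayer: leaving it as None keeps the batch dimension dynamic, which is what you want with a generator. Fixing it pins the static batch dimension, which mainly matters when the model has to know the batch size at build time, for example a stateful RNN. A minimal sketch of that case (a toy stateful LSTM, unrelated to your CNN):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, LSTM

stateful_model = Sequential()
# A stateful RNN carries its hidden state across batches, so it must
# know the batch size when it is built, hence the fixed value here:
stateful_model.add(InputLayer(input_shape=(10, 8), batch_size=32))
stateful_model.add(LSTM(16, stateful=True))
# Every batch fed to this model must now contain exactly 32 samples.

With a fixed batch dimension, a partial final batch would be rejected; in your case the last training batch has only 28 images (1500 is not a multiple of 32), so batch_size=None in the InputLayer is the right choice.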