I have this data, in which I specify the batch_size as 32:
# Preparing and preprocessing the data
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_dir = '/content/pizza_steak/train'
test_dir = '/content/pizza_steak/test'

train_data_gen_aug = ImageDataGenerator(rotation_range=0.2,
                                        width_shift_range=0.2,
                                        height_shift_range=0.2,
                                        shear_range=0.2,
                                        zoom_range=0.2,
                                        horizontal_flip=True,
                                        vertical_flip=True,
                                        rescale=1./255)
test_data_gen = ImageDataGenerator(rescale=1./255)

train_data_aug = train_data_gen_aug.flow_from_directory(train_dir,
                                                        target_size=(224, 224),
                                                        class_mode='binary',
                                                        batch_size=32,
                                                        seed=42)
test_data = test_data_gen.flow_from_directory(test_dir,
                                              target_size=(224, 224),
                                              class_mode='binary',
                                              batch_size=32,
                                              seed=42)
which returns:
Found 1500 images belonging to 2 classes.
Found 500 images belonging to 2 classes.
and when I explore it as below:
# Explore the data
# (train_data is a plain, non-augmented generator defined earlier
# with the same settings; its definition is omitted here)
train_images, train_labels = train_data.next()
train_images_aug, train_labels_aug = train_data_aug.next()
test_images_aug, test_labels_aug = test_data.next()

print('train_data: ', len(train_data), train_images.shape, train_labels.shape)
print('train_data_aug: ', len(train_data_aug), train_images_aug.shape, train_labels_aug.shape)
print('test_data: ', len(test_data), test_images_aug.shape, test_labels_aug.shape)
it returns:
train_data: 47 (32, 224, 224, 3) (32,)
train_data_aug: 47 (32, 224, 224, 3) (32,)
test_data: 16 (32, 224, 224, 3) (32,)
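(For context: len() of such a generator is the number of batches per epoch, so 47 = ceil(1500 / 32) for the training set and 16 = ceil(500 / 32) for the test set; the last training batch holds only the remaining 28 images.)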
I then build and compile the model, and I specify batch_size as None in the InputLayer:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, Conv2D, MaxPool2D, Flatten, Dense
from tensorflow.keras.activations import relu, sigmoid
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import BinaryCrossentropy
from tensorflow.keras.metrics import binary_accuracy
# create the model
model = Sequential()
# Add the input layer
INPUT_SHAPE = (224, 224, 3)
model.add(InputLayer(input_shape=INPUT_SHAPE,
                     batch_size=None))  # I entered the batch size here as None
# Add the hidden layers
model.add(Conv2D(filters=10,
                 kernel_size=3,
                 strides=1,
                 padding='valid',
                 activation=relu))
model.add(MaxPool2D(pool_size=(2, 2), strides=None, padding='valid'))
# Add the flatten layer
model.add(Flatten())
# Add the output layer
model.add(Dense(units=1, activation=sigmoid))
# Compile the model
model.compile(optimizer=Adam(),
              loss=BinaryCrossentropy(),
              metrics=[binary_accuracy])
I then fit the model, again specifying batch_size as None:
history = model.fit(train_data_aug,
                    batch_size=None,  # batch_size defined as None
                    epochs=5,
                    verbose=1,
                    validation_data=test_data,
                    steps_per_epoch=len(train_data_aug),
                    validation_steps=len(test_data))
The model works fine and, when trained for only 5 epochs, reaches a val_binary_accuracy of 81 percent.
In what situations should the batch size in the other two places be used? Is it possible to define the batch_size in all of them, or could that cause a problem?
Batch size is the number of samples per gradient update. If it is left unspecified (or set to None, as in your model.fit()), it defaults to 32. However, your data comes from a generator, which already yields batches, so you should not specify a batch size in fit().
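For contrast, here is a minimal sketch (using synthetic NumPy stand-in data, not your pizza/steak images) of the case where the batch_size argument of fit() actually takes effect:

import numpy as np
# Synthetic stand-in data shaped like your images: 100 samples
x = np.random.rand(100, 224, 224, 3).astype('float32')
y = np.random.randint(0, 2, size=(100,)).astype('float32')
# With plain arrays there is no pre-batching, so fit() slices the
# data itself: ceil(100 / 16) = 7 gradient updates per epoch
model.fit(x, y, batch_size=16, epochs=1)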
From the TensorFlow docs (https://www.tensorflow.org/api_docs/python/tf/keras/Model):

Do not specify the batch_size if your data is in the form of datasets, generators, or keras.utils.Sequence instances (since they generate batches).
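As for batch_size in the InputLayer: leaving it as None keeps the batch dimension dynamic, which is what you want with a generator. Fixing it pins the static batch dimension, which mainly matters when the model has to know the batch size at build time, for example a stateful RNN. A minimal sketch of that case (a toy stateful LSTM, unrelated to your CNN):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, LSTM

stateful_model = Sequential()
# A stateful RNN carries its hidden state across batches, so it must
# know the batch size when it is built, hence the fixed value here:
stateful_model.add(InputLayer(input_shape=(10, 8), batch_size=32))
stateful_model.add(LSTM(16, stateful=True))
# Every batch fed to this model must now contain exactly 32 samples.

With a fixed batch dimension, a partial final batch would be rejected; in your case the last training batch has only 28 images (1500 is not a multiple of 32), so batch_size=None in the InputLayer is the right choice.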