Tags: python, machine-learning, keras, training-data

Can flow_from_directory get train and validation data from the same directory in Keras?


I got the following example from here.

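# Import path assumes TensorFlow's bundled Keras.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Training images are rescaled and randomly augmented.
train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

# Validation images are only rescaled.
test_datagen = ImageDataGenerator(rescale=1./255)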

train_generator = train_datagen.flow_from_directory(
        'data/train',
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        'data/validation',
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')

This example uses two separate directories, one for training and one for validation. Can I instead split a single directory into training and validation data? Is there an example?


Solution

  • You can pass the validation_split argument (a fraction between 0 and 1) to the ImageDataGenerator constructor to reserve part of the data for validation:

    generator = ImageDataGenerator(..., validation_split=0.3)
    

    Then pass the subset argument to flow_from_directory to create the training and validation generators from the same directory:

    train_gen = generator.flow_from_directory(dir_path, ..., subset='training')
    val_gen = generator.flow_from_directory(dir_path, ..., subset='validation')
    

    Note: If you have set augmentation parameters on the ImageDataGenerator, both the training and the validation images will be augmented with this solution, because both subsets are produced by the same generator.
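
One way around the augmentation caveat is to create two ImageDataGenerator instances with the same validation_split, one with augmentation and one without, and take a different subset from each. The sketch below assumes TensorFlow's bundled Keras; make_generators and split_files are hypothetical helper names, not part of the Keras API. split_files is a pure-Python imitation of how the split appears to be computed in the Keras source (per class, the first validation_split fraction of the sorted file list goes to validation, the rest to training). If that holds for your version, the two subsets cannot overlap even though they come from different generators.

```python
def make_generators(dir_path, validation_split=0.3):
    """Return (train_gen, val_gen) that both read from dir_path.

    Hypothetical helper; the augmentation settings are copied from the question.
    """
    # Imported lazily so split_files below can be used without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Augmentation is configured on the training generator only.
    train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        validation_split=validation_split)

    # Same split fraction, but validation images are only rescaled.
    val_datagen = ImageDataGenerator(
        rescale=1./255,
        validation_split=validation_split)

    common = dict(target_size=(150, 150), batch_size=32, class_mode='binary')
    train_gen = train_datagen.flow_from_directory(
        dir_path, subset='training', **common)
    val_gen = val_datagen.flow_from_directory(
        dir_path, subset='validation', **common)
    return train_gen, val_gen


def split_files(filenames, validation_split, subset):
    """Imitate the per-class slicing (an assumption based on the Keras source)."""
    files = sorted(filenames)
    if subset == 'validation':
        lo, hi = 0.0, validation_split
    elif subset == 'training':
        lo, hi = validation_split, 1.0
    else:
        raise ValueError("subset must be 'training' or 'validation'")
    n = len(files)
    # The slice bounds depend only on the fraction and the file count.
    return files[int(lo * n):int(hi * n)]
```

For example, with eight files in one class and validation_split=0.25, split_files assigns the first two sorted file names to validation and the remaining six to training. It is still worth checking the "Found N images" counts that flow_from_directory prints for each generator against your own data.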