RAM overflow in Colab when running fit() on an AutoKeras ImageClassifier with many images


I'm trying to build an image classifier on a dataset of 40,000 images so that AutoKeras can then find the most appropriate model for me. Loading all the images and extracting their labels works, but when I run the normalization and label-encoding step, Google Colab runs out of RAM (even with a Pro+ account). Here is my code:

# Select TensorFlow 2.x and install AutoKeras (Colab notebook commands)
%tensorflow_version 2.x
!pip install autokeras

import glob
import os

import cv2
import numpy as np
import tensorflow as tf
import autokeras as ak
from tensorflow.keras.utils import to_categorical

# `path` is the directory that holds the PNG files (defined elsewhere)
images = glob.glob(path + '/*.png')

data = []
labels = []

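for i in images:
    # Load the image, convert it to float32, resize it, and scale it to [0, 1]
    image = tf.keras.preprocessing.image.load_img(i, color_mode='rgb')
    image = np.array(image, dtype='float32')
    image = cv2.resize(image, (180, 180))
    image /= 255.0
    data.append(image)
    # The label is the first whitespace-separated token of the file name
    label = os.path.basename(i.replace('.png', ''))
    label = label.split()[0]
    labels.append(label)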

data = np.array(data)
labels = np.array(labels)
print(labels)

Up to this point everything works like a charm, but the next part causes the overflow:

# Encode the string labels as integers, then one-hot encode them
X = data
indices, y = np.unique(labels, return_inverse=True)
y = to_categorical(y)

from sklearn.model_selection import train_test_split
# This line seems to trigger the RAM overflow in Colab
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = ak.ImageClassifier(overwrite=True, max_trials=20)
# This line also seems to trigger the RAM overflow
clf.fit(X_train, y_train, epochs=10)

Does anyone know what the problem might be? A hint in any direction would be highly appreciated, as this is driving me crazy!


Solution

  • I highly recommend using tf.data.Dataset to build the dataset:

    1. Apply every preprocessing step you need (such as resizing and normalizing) to the images with dataset.map.
    2. Instead of train_test_split, use dataset.take and dataset.skip to split the dataset.
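
    To see why the in-memory approach from the question is so expensive, here is a rough back-of-the-envelope estimate. It assumes the figures from the question (40,000 RGB images resized to 180x180 and stored as float32); the variable names are only illustrative.

```python
# Rough RAM estimate for holding every image in one NumPy array.
# Assumed sizes: 40,000 RGB images of 180x180 stored as float32.
num_images = 40_000
height, width, channels = 180, 180, 3
bytes_per_value = 4  # float32

array_bytes = num_images * height * width * channels * bytes_per_value
array_gib = array_bytes / 2**30

# train_test_split returns copies of the data, so for a while the
# full array and its train/test copies exist at the same time.
peak_gib = 2 * array_gib

print(f"data array: {array_gib:.1f} GiB")
print(f"with the train/test copies: {peak_gib:.1f} GiB")
```

    This prints roughly 14.5 GiB for the array alone and 29.0 GiB once the split copies exist, before the model search allocates anything. The Python list of images that is converted with np.array(data) adds another temporary copy on top of that.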

    Code for generating random images and labels:

    # !pip install autokeras
    import tensorflow as tf
    import autokeras as ak
    import numpy as np
    
    data = np.random.randint(0, 255, (45_000,32,32,3))
    label = np.random.randint(0, 10, 45_000)
    label = tf.keras.utils.to_categorical(label) 
    

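    Convert data and label to a tf.data.Dataset and preprocess them. Building the pipeline takes only about 55 ms for 45,000 images (benchmarked on Colab) because dataset.map is lazy: the resizing and normalization actually run later, as the dataset is iterated.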

    dataset = tf.data.Dataset.from_tensor_slices((data, label))
    def resize_normalize_preprocess(image, label):
        image = tf.image.resize(image, (16, 16))
        image = image / 255.0
        return image, label
    
    # %%timeit 
    dataset = dataset.map(resize_normalize_preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    # 1 loop, best of 5: 54.9 ms per loop
    
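    dataset_size = len(dataset)
    train_size = int(0.8 * dataset_size)
    
    # Keep the shuffle order fixed across iterations; otherwise take/skip can
    # produce a different train/test split on every pass over the data
    dataset = dataset.shuffle(32, reshuffle_each_iteration=False)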
    train_dataset = dataset.take(train_size)
    test_dataset = dataset.skip(train_size)
    
    print(f'Size dataset : {len(dataset)}')
    print(f'Size train_dataset : {len(train_dataset)}')
    print(f'Size test_dataset : {len(test_dataset)}')
    
    clf = ak.ImageClassifier(overwrite=True, max_trials=1)
    clf.fit(train_dataset, epochs=1)
    print(clf.evaluate(test_dataset))
    

    Output:

    Size dataset : 45000
    Size train_dataset : 36000
    Size test_dataset : 9000
    
    Search: Running Trial #1
    
    Value             |Best Value So Far |Hyperparameter
    vanilla           |?                 |image_block_1/block_type
    True              |?                 |image_block_1/normalize
    False             |?                 |image_block_1/augment
    3                 |?                 |image_block_1/conv_block_1/kernel_size
    1                 |?                 |image_block_1/conv_block_1/num_blocks
    2                 |?                 |image_block_1/conv_block_1/num_layers
    True              |?                 |image_block_1/conv_block_1/max_pooling
    False             |?                 |image_block_1/conv_block_1/separable
    0.25              |?                 |image_block_1/conv_block_1/dropout
    32                |?                 |image_block_1/conv_block_1/filters_0_0
    64                |?                 |image_block_1/conv_block_1/filters_0_1
    flatten           |?                 |classification_head_1/spatial_reduction_1/reduction_type
    0.5               |?                 |classification_head_1/dropout
    adam              |?                 |optimizer
    0.001             |?                 |learning_rate
    

    Result of the search, training, and evaluation (the labels are random, so accuracy stays near the 10% chance level for 10 classes):

    Trial 1 Complete [00h 01m 16s]
    val_loss: 2.3030436038970947
    
    Best val_loss So Far: 2.3030436038970947
    Total elapsed time: 00h 01m 16s
    INFO:tensorflow:Oracle triggered exit
    1125/1125 [==============================] - 68s 60ms/step - loss: 2.3072 - accuracy: 0.0979
    INFO:tensorflow:Assets written to: ./image_classifier/best_model/assets
    282/282 [==============================] - 26s 57ms/step - loss: 2.3025 - accuracy: 0.0970
    [2.302501916885376, 0.09700000286102295]
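
    The example above still starts from a NumPy array that is already in memory. For image files on disk, as in the question, the same idea can be applied to the file paths instead, so that each PNG is only read and decoded when it is needed. The following is a minimal sketch under stated assumptions, not a tested drop-in replacement: it assumes the question's naming scheme (the class is the first whitespace-separated token of the file name), and build_dataset, label_from_path, and IMG_SIZE are hypothetical names, not part of TensorFlow or AutoKeras.

```python
import os

import tensorflow as tf

IMG_SIZE = (180, 180)  # target size taken from the question


def label_from_path(file_path):
    """Return the class name: first whitespace-separated token of the file name."""
    name = os.path.basename(file_path).replace('.png', '')
    return name.split()[0]


def build_dataset(file_paths, class_names):
    """Build a dataset that streams images from disk instead of holding them in RAM."""
    # Only the paths and integer labels are kept in memory up front.
    labels = [class_names.index(label_from_path(p)) for p in file_paths]
    ds = tf.data.Dataset.from_tensor_slices((file_paths, labels))

    def load(path, label):
        # Read, decode, resize, and scale one image at a time.
        image = tf.io.decode_png(tf.io.read_file(path), channels=3)
        image = tf.image.resize(image, IMG_SIZE) / 255.0
        return image, tf.one_hot(label, len(class_names))

    return ds.map(load, num_parallel_calls=tf.data.AUTOTUNE)
```

    With paths = glob.glob(path + '/*.png') and class_names = sorted({label_from_path(p) for p in paths}), the result of build_dataset(paths, class_names) could then be shuffled and split with take and skip in the same way as the random-data example. This sketch has not been run against AutoKeras itself.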