Tags: python, keras, deep-learning, conv-neural-network, tensorflow2.0

How to solve CNN model fitting problem in tensorflow 2.2.0?


I want to train a CNN model on image data. I have 2 classes (mask and without mask). I import and save the data with the following code:

import os
import cv2
import numpy as np
from keras.utils import np_utils

data_path='/train/'
categories=os.listdir(data_path)            # one sub-folder per class
labels=[i for i in range(len(categories))]
label_dict=dict(zip(categories,labels))     # map category name -> integer label
data=[]
target=[]
for category in categories:
    folder_path=os.path.join(data_path,category)
    img_names=os.listdir(folder_path)
    for img_name in img_names:
        img_path=os.path.join(folder_path,img_name)
        img=cv2.imread(img_path)
        try:
            gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)   # convert to grayscale
            resized=cv2.resize(gray,(500, 500))         # resize every image to 500x500
            data.append(resized)
            target.append(label_dict[category])
        except Exception as e:
            print('Exception:',e)
data=np.array(data)/255.0                           # scale pixel values to [0, 1]
data=np.reshape(data,(data.shape[0],500, 500,1))    # add the channel dimension
target=np.array(target)
new_target=np_utils.to_categorical(target)          # one-hot encode the labels
#np.save('data',data)
#np.save('target',new_target)

and I build the model like this:

import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dropout, Dense

model=tf.keras.models.Sequential([
    Conv2D(32, 1, activation='relu', input_shape=(500, 500, 1)),
    MaxPooling2D(2,2),
    Conv2D(64, 1, activation='relu'),
    MaxPooling2D(2,2),
    Conv2D(128, 1, padding='same', activation='relu'),
    MaxPooling2D(2,2),
    Flatten(),
    Dropout(0.5), 
    Dense(256, activation='relu'),
    Dense(2, activation='softmax') # dense layer has a shape of 2 as we have only 2 classes 
])
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])

model.summary() gives me the following results:

________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 500, 500, 32)      64        
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 250, 250, 32)      0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 250, 250, 64)      2112      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 125, 125, 64)      0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 125, 125, 128)     8320      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 62, 62, 128)       0         
_________________________________________________________________
flatten (Flatten)            (None, 492032)            0         
_________________________________________________________________
dropout (Dropout)            (None, 492032)            0         
_________________________________________________________________
dense (Dense)                (None, 256)               125960448 
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 514       
=================================================================
Total params: 125,971,458
Trainable params: 125,971,458
Non-trainable params: 0

and then I fit the model, but the kernel stops. My fitting code is:

history=model.fit(data, target, epochs=10, batch_size=128, validation_data=data_val)

My TensorFlow version is 2.2.0. Why doesn't my model run?


Solution

  • It seems your kernel is dying (being killed) because the process is taking up too many resources. You are building an unnecessarily complex model with too many connections and trainable parameters. In fact, the single dense layer alone is responsible for 99.991% of all your trainable parameters (125,960,448 of 125,971,458).

    The issue is that you are running out of computational resources (primarily RAM). Just to give you some context, the following are some of the most influential CNN-based architectures, most of which were trained for DAYS on powerful GPUs (a quick way to check these counts for your own model is sketched right after the list).

    LeNet-5 - 60,000 parameters
    AlexNet - 60M parameters
    VGG-16 - 138M parameters
    Inception-v1 - 5M parameters
    Inception-v3 - 24M parameters
    ResNet-50 - 26M parameters
    Xception - 23M parameters
    Inception-v4 - 43M parameters
    Inception-ResNet-V2 - 56M parameters
    ResNeXt-50 - 25M parameters
    
    Your basic 3-conv-layer model - 126M parameters!
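
    If you want to verify these counts for your own model, Keras exposes them directly; a minimal sketch, assuming model is the Sequential model you built above:

    # total number of parameters in the model
    print(model.count_params())

    # per-layer breakdown, to see which layer dominates
    for layer in model.layers:
        print(layer.name, layer.count_params())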
    

    Here is what you can do -

    flatten (Flatten)            (None, 492032)            0         
    _________________________________________________________________
    dropout (Dropout)            (None, 492032)            0         
    _________________________________________________________________
    dense (Dense)                (None, 256)               125960448 <---!!!!
    _________________________________________________________________
    

    You are flattening a 62x62x128 tensor into a 492,032-length vector! Instead, either try adding more CNN layers to bring the first two dimensions down to something more manageable, AND/OR increase the kernel size in the earlier CNN layers.

    The goal here is to have a manageably sized tensor before you hit the Dense layer. Also, try drastically reducing the number of nodes in the dense layer.
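
    For intuition, here is the back-of-the-envelope arithmetic behind that flagged dense layer (the 4-bytes-per-float32 figure estimates the size of this one weight matrix only, not total training memory):

    flat = 62 * 62 * 128           # flattened output of the last MaxPooling2D -> 492,032
    units = 256                    # nodes in the first Dense layer
    params = flat * units + units  # weights + biases = 125,960,448
    print(params)
    print(params * 4 / 1024**2)    # roughly 480 MB for this single weight matrix in float32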

    Try something like this for starters, something that your device can actually handle without killing the kernel, say with roughly 680k parameters (you should start even simpler, though, and increase complexity later).

    model=tf.keras.models.Sequential([
        Conv2D(32, 3, activation='relu', input_shape=(500, 500, 1)),
        MaxPooling2D(3,3),
        Conv2D(64, 3, activation='relu'),
        MaxPooling2D(3,3),
        Conv2D(128, 3, padding='same', activation='relu'),
        MaxPooling2D(3,3),
        Conv2D(256, 3, padding='same', activation='relu'),
        MaxPooling2D(3,3),
        Flatten(),
        Dropout(0.5), 
        Dense(32, activation='relu'),
        Dense(2, activation='softmax') # dense layer has a shape of 2 as we have only 2 classes 
    ])
    
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    conv2d_19 (Conv2D)           (None, 498, 498, 32)      320       
    _________________________________________________________________
    max_pooling2d_18 (MaxPooling (None, 166, 166, 32)      0         
    _________________________________________________________________
    conv2d_20 (Conv2D)           (None, 164, 164, 64)      18496     
    _________________________________________________________________
    max_pooling2d_19 (MaxPooling (None, 54, 54, 64)        0         
    _________________________________________________________________
    conv2d_21 (Conv2D)           (None, 54, 54, 128)       73856     
    _________________________________________________________________
    max_pooling2d_20 (MaxPooling (None, 18, 18, 128)       0         
    _________________________________________________________________
    conv2d_22 (Conv2D)           (None, 18, 18, 256)       295168    
    _________________________________________________________________
    max_pooling2d_21 (MaxPooling (None, 6, 6, 256)         0         
    _________________________________________________________________
    flatten_5 (Flatten)          (None, 9216)              0         
    _________________________________________________________________
    dropout_5 (Dropout)          (None, 9216)              0         
    _________________________________________________________________
    dense_10 (Dense)             (None, 32)                294944    
    _________________________________________________________________
    dense_11 (Dense)             (None, 2)                 66        
    =================================================================
    Total params: 682,850
    Trainable params: 682,850
    Non-trainable params: 0
    _________________________________________________________________
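
    As a follow-up, here is a minimal compile-and-fit sketch for this smaller model. The scikit-learn split, the 80/20 ratio, and the smaller batch size are my assumptions, not part of your original code; I also use categorical_crossentropy, since the labels were one-hot encoded for a 2-unit softmax:

    from sklearn.model_selection import train_test_split

    # hold out 20% of the data for validation (split ratio is an assumption)
    x_train, x_val, y_train, y_val = train_test_split(data, new_target, test_size=0.2)

    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    # a smaller batch size keeps peak memory manageable for 500x500 inputs
    history = model.fit(x_train, y_train, epochs=10, batch_size=32,
                        validation_data=(x_val, y_val))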