Keras + CNTK: TensorSliceWithMBLayoutFor

I am running into a few problems while migrating image segmentation code from Keras with the TensorFlow backend to Keras with the CNTK backend. The code runs perfectly with the TF backend but crashes with CNTK.

The model was inspired by https://github.com/jocicmarko/ultrasound-nerve-segmentation/blob/master/train.py

Model inputs are defined as inputs = Input((img_width, img_height, num_channels)), where num_channels = 1.

The error comes from the line trying to fit the model: model.fit(X_train, Y_train, epochs=trainingEpochs, verbose=2, shuffle=True, validation_data=(X_val, Y_val), callbacks=cb_list)

Here X_train, Y_train, X_val and Y_val all have shape (num_slices, img_width, img_height, num_channels).

The error I keep getting is the following:

Traceback (most recent call last):
File "TrainNetwork_CNTK.py", line 188, in
history = model.fit(X_train, Y_train, epochs=trainingEpochs, verbose=2, shuffle=True, validation_data=(X_val, Y_val), callbacks=cb_list)
File "C:\Users...\site-packages\keras\engine\training.py", line 1430, in fit
initial_epoch=initial_epoch)
File "C:\Users...\site-packages\keras\engine\training.py", line 1079, in _fit_loop
outs = f(ins_batch)
File "C:\Users...\site-packages\keras\backend\cntk_backend.py", line 1664, in __call__
input_dict, self.trainer_output)
File "C:\Users...\site-packages\cntk\train\trainer.py", line 160, in train_minibatch
output_map, device)
File "C:\Users...\site-packages\cntk\cntk_py.py", line 2769, in train_minibatch
return _cntk_py.Trainer_train_minibatch(self, *args)
RuntimeError: Node 'UserDefinedFunction2738' (UserDefinedV2Function operation): TensorSliceWithMBLayoutFor: FrameRange's dynamic axis is inconsistent with data:

There seems to be very little activity on CNTK issues here on SO, so anything that could shed some light on this issue would be very helpful!

Solution

The reason is the loss function:

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

There is a known issue in the Keras CNTK backend's flatten implementation, which causes the batch axis shape not to match in this case. Unfortunately, I haven't had a chance to fix it :(

But in your case, I don't think we need the flatten here, right? You are calling K.sum(x) with the default axis option, which applies a reduce-sum over all axes to produce a scalar, so we should get the same result without flattening. I tried the loss function below and it seems to work:

def dice_coef(y_true, y_pred):
    intersection = K.sum(y_true * y_pred)
    # Sum the unflattened tensors directly; K.sum with no axis reduces over all axes.
    return (2. * intersection + smooth) / (K.sum(y_true) + K.sum(y_pred) + smooth)
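
To check that dropping the flatten does not change the value, here is a small sketch that uses NumPy as a stand-in for the Keras backend (np.sum with no axis also reduces over every axis). The value smooth = 1.0, the array shapes and the names dice_flat and dice_no_flat are assumptions made for illustration; they are not taken from the question.

```python
import numpy as np

smooth = 1.0  # assumed value; the question does not show how `smooth` is defined


def dice_flat(y_true, y_pred):
    # Original formulation: flatten both arrays first, then reduce.
    t = y_true.ravel()
    p = y_pred.ravel()
    intersection = np.sum(t * p)
    return (2.0 * intersection + smooth) / (np.sum(t) + np.sum(p) + smooth)


def dice_no_flat(y_true, y_pred):
    # Flatten-free formulation: np.sum with no axis reduces over all axes.
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)


rng = np.random.default_rng(0)
# Hypothetical batch shaped (num_slices, img_width, img_height, num_channels).
y_true = (rng.random((4, 8, 8, 1)) > 0.5).astype(np.float64)
y_pred = rng.random((4, 8, 8, 1))

flat_value = dice_flat(y_true, y_pred)
no_flat_value = dice_no_flat(y_true, y_pred)
print(np.isclose(flat_value, no_flat_value))
```

This only confirms the arithmetic. It does not exercise the CNTK backend, so whether the crash goes away still has to be checked by training with the flatten-free loss.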