I am trying to use the DropBlock2D layer from KerasCV, version 0.9.0. However, a model containing the layer only trains if I compile it with run_eagerly=True. This appears to happen because, when the forward pass is attempted in the non-eager run, a symbolic tensor is passed to the layer, which expects a concrete value instead. The Keras docs say that run_eagerly should be reserved for debugging, so why is it necessary that I enable it here?
Interactive example (Google Colab)
I set up the model using the functional approach:
import keras
import keras_cv

# input_shape and num_classes are defined earlier in the notebook
inputs = keras.layers.Input(shape=input_shape)
x = keras.layers.Conv2D(32, (1, 1))(inputs)
x = keras.layers.BatchNormalization()(x)
x = keras.layers.ReLU()(x)
x = keras_cv.layers.DropBlock2D(rate=0.05, block_size=(14, 14))(x)
x = keras.layers.GlobalAveragePooling2D()(x)
output = keras.layers.Dense(num_classes)(x)
model = keras.Model(
    inputs=inputs,
    outputs=output,
)
Then I compile the model:
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(),
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    # Uncommenting below will make things work
    # run_eagerly=True
)
Then I load data for training:
mnist_train, mnist_test = keras.datasets.fashion_mnist.load_data()
mnist_x_train, mnist_y_train = mnist_train
model.fit(
    mnist_x_train[0:20, :, :],
    mnist_y_train[0:20],
    epochs=2,
)
The error is raised at the model.fit call if run_eagerly is not True when I compile:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-28-15f407e6a023> in <cell line: 41>()
39 mnist_x_train, mnist_y_train = mnist_train
40
---> 41 model.fit(
42 mnist_x_train[0:20,:,:],
43 mnist_y_train[0:20],
1 frames
/usr/local/lib/python3.10/dist-packages/keras_cv/src/layers/regularization/dropblock_2d.py in call(self, x, training)
190 valid_block = ops.logical_and(
191 ops.logical_and(
--> 192 w_i >= int(dropblock_width // 2),
193 w_i < width - (dropblock_width - 1) // 2,
194 ),
TypeError: Exception encountered when calling DropBlock2D.call().
int() argument must be a string, a bytes-like object or a real number, not 'SymbolicTensor'
Arguments received by DropBlock2D.call():
• x=tf.Tensor(shape=(None, 28, 28, 32), dtype=float32)
• training=True
I understand that the int cast won't work on a symbolic tensor, but I am wondering if anything can be done about this short of filing a bug with KerasCV.
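To illustrate the difference between the eager and non-eager runs, here is a minimal pure-Python sketch. SymbolicValue, cast_to_int and try_python_int are hypothetical names made up for this illustration and are not Keras classes or functions; the toy only mimics the idea that a traced value is a placeholder with no number behind it, so Python's int() has nothing to convert, while a backend-style cast can simply record another node.

```python
class SymbolicValue:
    """Toy stand-in for a graph-mode tensor: it has a name but no concrete value."""

    def __init__(self, name):
        self.name = name

    def __floordiv__(self, other):
        # Arithmetic on a placeholder records a new node instead of computing.
        return SymbolicValue(f"({self.name} // {other})")


def cast_to_int(value):
    """Toy analogue of a backend cast op: accepts placeholders and real numbers."""
    if isinstance(value, SymbolicValue):
        return SymbolicValue(f"cast({value.name}, int32)")
    return int(value)


def try_python_int(value):
    """Return the int() result, or the TypeError text if int() refuses the value."""
    try:
        return int(value)
    except TypeError as err:
        return f"TypeError: {err}"


eager_width = 14                        # eager run: a real number is available
traced_width = SymbolicValue("width")   # non-eager run: only a placeholder exists

print(try_python_int(eager_width // 2))      # int() succeeds on a concrete value
print(try_python_int(traced_width // 2))     # int() raises TypeError on the placeholder
print(cast_to_int(traced_width // 2).name)   # the cast stays symbolic and succeeds
```

This is only an analogy for the behaviour seen in the traceback: with run_eagerly=True the layer receives concrete values that int() can convert, whereas during tracing it does not.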
The most straightforward fix that does not require waiting for upstream is to subclass the layer and override the call method where the cast takes place, replacing the Python int cast with a call to ops.cast(x, dtype="int32"):
import keras_cv
from keras_cv.src.backend import ops
from keras_cv.src.backend import random


class PatchedDropBlock2D(keras_cv.layers.DropBlock2D):
    def call(self, x, training=None):
        if not training or self._rate == 0.0:
            return x

        _, height, width, _ = ops.split(ops.shape(x), 4)

        # Unnest scalar values
        height = ops.squeeze(height)
        width = ops.squeeze(width)

        dropblock_height = ops.minimum(self._dropblock_height, height)
        dropblock_width = ops.minimum(self._dropblock_width, width)

        gamma = (
            self._rate
            * ops.cast(width * height, dtype="float32")
            / ops.cast(dropblock_height * dropblock_width, dtype="float32")
            / ops.cast(
                (width - self._dropblock_width + 1)
                * (height - self._dropblock_height + 1),
                "float32",
            )
        )

        # Forces the block to be inside the feature map.
        w_i, h_i = ops.meshgrid(ops.arange(width), ops.arange(height))
        valid_block = ops.logical_and(
            ops.logical_and(
                # ops.cast replaces the Python int() that fails on symbolic tensors
                w_i >= ops.cast(dropblock_width // 2, dtype="int32"),
                w_i < width - (dropblock_width - 1) // 2,
            ),
            ops.logical_and(
                h_i >= ops.cast(dropblock_height // 2, dtype="int32"),
                # Bound rows by height; the copied line used width here,
                # which gives the same result only for square feature maps.
                h_i < height - (dropblock_height - 1) // 2,
            ),
        )

        valid_block = ops.reshape(valid_block, [1, height, width, 1])

        random_noise = random.uniform(
            ops.shape(x), seed=self._random_generator, dtype="float32"
        )
        valid_block = ops.cast(valid_block, dtype="float32")
        seed_keep_rate = ops.cast(1 - gamma, dtype="float32")
        block_pattern = (1 - valid_block + seed_keep_rate + random_noise) >= 1
        block_pattern = ops.cast(block_pattern, dtype="float32")

        window_size = [1, self._dropblock_height, self._dropblock_width, 1]

        # Double negative and max_pool is essentially min_pooling
        block_pattern = -ops.max_pool(
            -block_pattern,
            pool_size=window_size,
            strides=[1, 1, 1, 1],
            padding="SAME",
        )

        # Slightly scale the values, to account for magnitude change
        percent_ones = ops.cast(ops.sum(block_pattern), "float32") / ops.cast(
            ops.size(block_pattern), "float32"
        )
        return (
            x
            / ops.cast(percent_ones, x.dtype)
            * ops.cast(block_pattern, x.dtype)
        )
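As a sanity check on what the call method computes, the gamma expression and the valid-seed bounds can be worked through in plain Python for the shapes in the question (a 28x28 feature map, block_size=(14, 14), rate=0.05). This is only an arithmetic sketch with ordinary integers, not the layer itself; the variable names are chosen for illustration.

```python
# Values from the question: 28x28 feature map, block_size=(14, 14), rate=0.05.
height, width = 28, 28
block_h, block_w = 14, 14
rate = 0.05

# Seed probability, following the same expression as gamma in call().
gamma = (
    rate
    * (width * height)
    / (block_h * block_w)
    / ((width - block_w + 1) * (height - block_h + 1))
)

# Seed positions that keep a whole block inside the feature map,
# using the same bounds as valid_block.
valid_cols = [w for w in range(width)
              if block_w // 2 <= w < width - (block_w - 1) // 2]
valid_rows = [h for h in range(height)
              if block_h // 2 <= h < height - (block_h - 1) // 2]

num_valid = len(valid_cols) * len(valid_rows)
expected_seeds = gamma * num_valid  # expected number of block seeds per channel

print(f"gamma = {gamma:.6f}")
print(f"valid seed positions = {num_valid}")
print(f"expected seeds per channel = {expected_seeds:.3f}")
```

With these settings there are 15 valid seed columns and 15 valid seed rows, which matches the (width - block_w + 1) and (height - block_h + 1) factors in the denominator of gamma.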