SparseTopKCategoricalAccuracy metric fails with MirroredStrategy in tf.keras


My machine has two GPUs, and I use MirroredStrategy() to share the load between them. I run into a problem when I use SparseTopKCategoricalAccuracy as the metric. My code is as follows:

with mirrored_strategy.scope():

    model=Sequential()

    model.add(Dense(512, kernel_initializer= 'he_uniform', input_shape=(X_train.shape[1],),
                activity_regularizer=l1(1.e-3)))
    model.add(Activation('relu'))

    model.add(BatchNormalization(momentum=0.0)) 

    model.add(Dropout(0.2))
    model.add(Dense(4096, kernel_initializer= 'he_uniform', activation='relu',
                activity_regularizer=l1(1.e-4)))

    model.add(BatchNormalization(momentum=0.0))

    model.add(Dropout(0.3))

    model.add(Dense(512, kernel_initializer= 'he_uniform', activation='relu',
                activity_regularizer=l1(1.e-4)))

    model.add(BatchNormalization(momentum=0.0))

    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))


opt=Adam(learning_rate=0.001)
m=SparseTopKCategoricalAccuracy(k=10)
model.compile(optimizer=opt, loss='sparse_categorical_crossentropy', metrics=m)  

es = EarlyStopping(monitor='val_sparse_top_k_categorical_accuracy', patience=10, mode='max', verbose=0, restore_best_weights=True)
lr = LearningRateScheduler(lr_scheduler)

The error message I received is as follows:

ValueError: Metric (<tensorflow.python.keras.metrics.SparseTopKCategoricalAccuracy object at 0x7fee452d4f90>) passed to model.compile was created inside of a different distribution strategy scope than the model. All metrics must be created in the same distribution strategy scope as the model (in this case <tensorflow.python.distribute.mirrored_strategy.MirroredStrategy object at 0x7ff5cd1ff490>). If you pass in a string identifier for a metric to compile the metric will automatically be created in the correct distribution strategy scope.

If I remove "with mirrored_strategy.scope():", it works, but on only one GPU. What do I need to change to make it run on both GPUs?


Solution

  • The cause is in the error message:

    All metrics must be created in the same distribution strategy scope as the model

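    A Keras metric object holds state variables, and they are created when the metric is instantiated, so the metric has to be instantiated within the mirrored strategy scope, just like the model. The string identifier 'sparse_top_k_categorical_accuracy' would also be created in the correct scope, but it uses the default k=5 rather than k=10.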

    with mirrored_strategy.scope():
        # Define model ...
        # Create the metric in the same scope as the model.
        m = SparseTopKCategoricalAccuracy(k=10)
    
    model.compile(optimizer=opt, loss='sparse_categorical_crossentropy', metrics=[m])
    
    es = EarlyStopping(monitor='val_sparse_top_k_categorical_accuracy', patience=10, mode='max', verbose=0, restore_best_weights=True)
    lr = LearningRateScheduler(lr_scheduler)
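
For reference, here is a minimal end-to-end sketch of the fix. It is an illustration under assumptions, not the asker's actual setup: the small model, the synthetic X and y arrays, and the names num_features and strategy are made up for the example. The optimizer and the compile call are also placed inside the scope here, which is the pattern the TensorFlow distributed-training guide uses.

```python
import numpy as np
import tensorflow as tf

# Assumed toy dimensions; the question uses X_train.shape[1] and num_classes.
num_features, num_classes = 20, 15

# MirroredStrategy replicates the model on every visible GPU
# (or runs a single replica if no GPU is available).
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Model, optimizer, and metric are all created under the same strategy.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(num_features,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    opt = tf.keras.optimizers.Adam(learning_rate=0.001)
    metric = tf.keras.metrics.SparseTopKCategoricalAccuracy(k=10)
    model.compile(optimizer=opt,
                  loss="sparse_categorical_crossentropy",
                  metrics=[metric])

# Synthetic data standing in for X_train / y_train (hypothetical values).
rng = np.random.default_rng(0)
X = rng.normal(size=(256, num_features)).astype("float32")
y = rng.integers(0, num_classes, size=(256,))

# fit() can be called outside the scope; the model remembers its strategy.
history = model.fit(X, y, validation_split=0.25, epochs=2,
                    batch_size=32, verbose=0)
print(sorted(history.history.keys()))
```

The printed replica count is a quick way to check that both GPUs are actually being used. The history keys should include the val_sparse_top_k_categorical_accuracy name that the EarlyStopping callback monitors.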