Tags: tensorflow, keras, classification, metrics, multiclass-classification

Tensorflow Macro F1 Score for multiclass and also for binary classification


I am trying to train two 1D convolutional neural networks: one for a multiclass classification problem and one for a binary classification problem. For both, one of my metrics has to be the macro F1 score. However, I am having trouble using tfa.metrics.F1Score from TensorFlow Addons.

Multiclass classification

I have 3 classes encoded as 0, 1, 2.

The last layer of the network and the compile call look like this (int_sequences_input is the input layer):

from tensorflow_addons.metrics import F1Score

preds = layers.Dense(3, activation="softmax")(x)
model = keras.Model(int_sequences_input, preds)

f1_macro = F1Score(num_classes=3, average='macro')
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy', f1_macro])

However, when I run model.fit(), I get the following error:

ValueError: Dimension 0 in both shapes must be equal, but are 3 and 1. Shapes are [3] and [1]. for '{{node AssignAddVariableOp_7}} = AssignAddVariableOp[dtype=DT_FLOAT](AssignAddVariableOp_7/resource, Sum_6)' with input shapes: [], [1].

shapes of data:

X_train - (23658, 150)

y_train - (23658,)

Binary classification

I have 2 classes encoded as 0,1

The last layer of the network and the compile call look like this (int_sequences_input is the input layer):

preds = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(int_sequences_input, preds)

print(model.summary())

f1_macro = F1Score(num_classes=2, average='macro')
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy',f1_macro])

Again, when I run model.fit(), I get the following error:

ValueError: Dimension 0 in both shapes must be equal, but are 2 and 1. Shapes are [2] and [1]. for '{{node AssignAddVariableOp_4}} = AssignAddVariableOp[dtype=DT_FLOAT](AssignAddVariableOp_4/resource, Sum_3)' with input shapes: [], [1].

shapes of data:

X_train - (15770, 150)

y_train - (15770,)

So my question is: how do I evaluate both of my models with the macro F1 score? How can I fix my implementation so it works with tfa.metrics.F1Score? Or is there another way to calculate the macro F1 score without tfa.metrics.F1Score? Thanks.


Solution

  • Have a look at the usage example from its doc page.

    import numpy as np
    import tensorflow_addons as tfa

    metric = tfa.metrics.F1Score(num_classes=3, threshold=0.5)
    y_true = np.array([[1, 1, 1],
                       [1, 0, 0],
                       [1, 1, 0]], np.int32)
    y_pred = np.array([[0.2, 0.6, 0.7],
                       [0.2, 0.6, 0.6],
                       [0.6, 0.8, 0.0]], np.float32)
    metric.update_state(y_true, y_pred)
    result = metric.result()  # per-class F1 scores, since average defaults to None
    

    You can see that it expects the labels to be in one-hot format.

    But given the shapes you mentioned above:

    shapes of data:
    X_train - (23658, 150)
    
    y_train - (23658,)
    

    It looks like your labels are in index format. Try converting them to one-hot with tf.one_hot(y_train, num_classes), and change your loss to loss='categorical_crossentropy' to match, as sketched below.
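
  • Putting that together, here is a minimal sketch of the fix for the multiclass model. It reuses the names from the question (x, int_sequences_input, X_train, y_train); the epochs and batch_size values are placeholders, not values taken from the question:

    import tensorflow as tf
    import tensorflow_addons as tfa
    from tensorflow import keras
    from tensorflow.keras import layers

    # Index labels of shape (23658,) -> one-hot labels of shape (23658, 3)
    y_train_oh = tf.one_hot(y_train, depth=3)

    preds = layers.Dense(3, activation="softmax")(x)
    model = keras.Model(int_sequences_input, preds)

    # With one-hot targets, the metric's internal shape [3] now matches its updates
    f1_macro = tfa.metrics.F1Score(num_classes=3, average="macro")
    model.compile(loss="categorical_crossentropy",  # matches the one-hot targets
                  optimizer="adam",
                  metrics=["accuracy", f1_macro])

    model.fit(X_train, y_train_oh, epochs=10, batch_size=32)

    The binary model should work the same way if you give it a 2-unit softmax output, labels from tf.one_hot(y_train, depth=2), and loss='categorical_crossentropy'. Alternatively, you can keep the single sigmoid output with binary_crossentropy and use tfa.metrics.F1Score(num_classes=1, average='macro', threshold=0.5), so the metric's shape matches the (batch, 1) predictions.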