How to calculate F1 Macro in Keras?

Since Keras 2.0, the f1, precision, and recall metrics have been removed. The solution is to use a custom metric function:

from keras import backend as K

def f1(y_true, y_pred):
    def recall(y_true, y_pred):
        """Recall metric.

        Only computes a batch-wise average of recall.

        Computes the recall, a metric for multi-label classification of
        how many relevant items are selected.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
        recall = true_positives / (possible_positives + K.epsilon())
        return recall

    def precision(y_true, y_pred):
        """Precision metric.

        Only computes a batch-wise average of precision.

        Computes the precision, a metric for multi-label classification of
        how many selected items are relevant.
        """
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
        precision = true_positives / (predicted_positives + K.epsilon())
        return precision
    precision = precision(y_true, y_pred)
    recall = recall(y_true, y_pred)
    return 2*((precision*recall)/(precision+recall+K.epsilon()))


model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=[f1])

The return line of this function

return 2*((precision*recall)/(precision+recall+K.epsilon()))

was modified by adding the constant epsilon to the denominator in order to avoid division by 0, so NaN is never produced.


Using a Keras metric function is not the right way to calculate F1, AUC, or similar metrics.

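The reason is that the metric function is called on every batch during validation, and Keras then averages the per-batch results. That average is not the correct F1 score: F1 has to be computed from the true positive, false positive, and false negative counts of the whole validation set, and the mean of per-batch F1 values is generally a different number.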
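To make the difference concrete, here is a minimal sketch in plain Python. The two batches are made-up data and the helper names (counts, f1_from_counts) are hypothetical, chosen only for this illustration. It compares the mean of the per-batch F1 values with the F1 computed over all samples at once.

```python
def counts(y_true, y_pred):
    """Return (tp, fp, fn) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, fn


def f1_from_counts(tp, fp, fn):
    """F1 = 2*tp / (2*tp + fp + fn); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical validation set, split into two batches of four samples each.
# Each entry is (y_true, y_pred) with already-rounded predictions.
batches = [
    ([1, 1, 1, 1], [1, 1, 1, 1]),  # every positive found
    ([1, 0, 0, 0], [0, 1, 0, 0]),  # one miss and one false alarm
]

# What a batch-wise metric reports: the mean of the per-batch scores.
batch_scores = [f1_from_counts(*counts(t, p)) for t, p in batches]
mean_of_batches = sum(batch_scores) / len(batch_scores)

# What is actually wanted: F1 over the whole validation set.
all_true = [v for t, _ in batches for v in t]
all_pred = [v for _, p in batches for v in p]
global_f1 = f1_from_counts(*counts(all_true, all_pred))

print("mean of per-batch F1:", mean_of_batches)
print("F1 over all samples: ", global_f1)
```

With this data the per-batch scores are 1.0 and 0.0, so the batch-wise average is 0.5, while the score over all eight samples is about 0.8.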

That is why the F1 score was removed from the metric functions in Keras. See:

  • https://github.com/keras-team/keras/commit/a56b1a55182acf061b1eb2e2c86b48193a0e88f7
  • https://github.com/keras-team/keras/issues/5794

The right way to do this is to use a custom callback that computes the score on the whole validation set, for example:

  • https://github.com/PhilipMay/mltb#module-keras
  • https://medium.com/@thongonary/how-to-compute-f1-score-for-each-epoch-in-keras-a1acd17715a2
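A rough sketch of such a callback follows. It is not the code from either link. It assumes a multi-class model with a softmax output, integer class labels in y_val, and TensorFlow's bundled Keras. The names macro_f1, make_macro_f1_callback and the log key val_macro_f1 are hypothetical. Macro F1 here means the unweighted mean of the per-class F1 scores.

```python
import numpy as np


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores for integer class labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for c in range(num_classes):
        tp = np.sum((y_true == c) & (y_pred == c))
        fp = np.sum((y_true != c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted samples scores 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))


def make_macro_f1_callback(x_val, y_val, num_classes):
    """Build a callback that reports macro F1 on the full validation set."""
    # Imported lazily so that macro_f1 stays usable without TensorFlow.
    from tensorflow import keras

    class MacroF1(keras.callbacks.Callback):
        def on_epoch_end(self, epoch, logs=None):
            # Score all validation samples together, once per epoch,
            # instead of averaging per-batch values.
            probs = self.model.predict(x_val, verbose=0)
            y_pred = np.argmax(probs, axis=1)
            score = macro_f1(y_val, y_pred, num_classes)
            if logs is not None:
                logs["val_macro_f1"] = score
            print(f"epoch {epoch + 1}: val_macro_f1 = {score:.4f}")

    return MacroF1()
```

It would be passed to training as model.fit(x_train, y_train, callbacks=[make_macro_f1_callback(x_val, y_val, num_classes)]). For a binary or multi-label model with sigmoid outputs, the argmax step would have to be replaced by thresholding.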
