
Keras - finding recall values of binary classification


I am having trouble computing recall with sklearn. I am using Keras 2.0 to solve a binary classification problem. To compute recall I have to rely on sklearn's metrics, but recall_score raises a ValueError. Here is my sample code and the stack trace:

>> print(y_true[:5])
[[ 0.  1.]
 [ 0.  1.]
 [ 1.  0.]
 [ 0.  1.]
 [ 1.  0.]]

>> y_scores = model.predict(x_val)
>> print(y_scores[:5])
[[  7.00690389e-01   2.99309582e-01]
 [  9.36253404e-04   9.99063790e-01]
 [  9.99530196e-01   4.69864986e-04]
 [  6.66563153e-01   3.33436847e-01]
 [  9.98917222e-01   1.08276575e-03]]

>> from sklearn import metrics
>> recall_score(y_true, y_scores)

ValueError                                Traceback (most recent call last)
<ipython-input-39-9f93c0c66265> in <module>()
      1 from sklearn import metrics
----> 2 recall_score(y_true, y_scores)

~\AppData\Local\Continuum\miniconda3\envs\deepnetwork\lib\site-packages\sklearn\metrics\classification.py in recall_score(y_true, y_pred, labels, pos_label, average, sample_weight)
   1357                                                  average=average,
   1358                                                  warn_for=('recall',),
-> 1359                                                  sample_weight=sample_weight)
   1360     return r
   1361 

~\AppData\Local\Continuum\miniconda3\envs\deepnetwork\lib\site-packages\sklearn\metrics\classification.py in precision_recall_fscore_support(y_true, y_pred, beta, labels, pos_label, average, warn_for, sample_weight)
   1023         raise ValueError("beta should be >0 in the F-beta score")
   1024 
-> 1025     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
   1026     present_labels = unique_labels(y_true, y_pred)
   1027 

~\AppData\Local\Continuum\miniconda3\envs\deepnetwork\lib\site-packages\sklearn\metrics\classification.py in _check_targets(y_true, y_pred)
     79     if len(y_type) > 1:
     80         raise ValueError("Classification metrics can't handle a mix of {0} "
---> 81                          "and {1} targets".format(type_true, type_pred))
     82 
     83     # We can't have more than one value on y_type => The set is no more needed

ValueError: Classification metrics can't handle a mix of multilabel-indicator and continuous-multioutput targets

Solution

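  • Your y_true is one-hot encoded and your y_scores holds per-class probabilities, while, according to the docs, recall_score expects 1-D arrays of class labels for a binary problem. Take the argmax of each row to turn both into label vectors first: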

    from sklearn.metrics import recall_score
    import numpy as np
    
    y_true = [[ 0. , 1.],
              [ 0. , 1.],
              [ 1. , 0.],
              [ 0. , 1.],
              [ 1. , 0.]]
    
    y_scores = [[  7.00690389e-01 ,  2.99309582e-01],
                [  9.36253404e-04 ,  9.99063790e-01],
                [  9.99530196e-01 ,  4.69864986e-04],
                [  6.66563153e-01  , 3.33436847e-01],
                [  9.98917222e-01 ,  1.08276575e-03]]
    
    yy_true = [np.argmax(i) for i in y_true]
    yy_true
    # [1, 1, 0, 1, 0]
    
    yy_scores = [np.argmax(i) for i in y_scores]
    yy_scores
    # [0, 1, 0, 0, 0]
    
    recall_score(yy_true, yy_scores)
    # 0.33333333333333331
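
If the data is already in NumPy arrays (which is what model.predict returns), the same conversion can be done without a Python loop. The following is only a sketch on the five sample rows from the question. The variable names true_kind, score_kind, true_labels and pred_labels are made up for illustration, and type_of_target is used just to show how scikit-learn categorises the two inputs, which is where the "mix of ... targets" message comes from.

```python
import numpy as np
from sklearn.metrics import recall_score
from sklearn.utils.multiclass import type_of_target

# The five sample rows from the question, held as NumPy arrays.
y_true = np.array([[0., 1.],
                   [0., 1.],
                   [1., 0.],
                   [0., 1.],
                   [1., 0.]])
y_scores = np.array([[7.00690389e-01, 2.99309582e-01],
                     [9.36253404e-04, 9.99063790e-01],
                     [9.99530196e-01, 4.69864986e-04],
                     [6.66563153e-01, 3.33436847e-01],
                     [9.98917222e-01, 1.08276575e-03]])

# How scikit-learn categorises each input; the two kinds differ,
# which is what recall_score complains about.
true_kind = type_of_target(y_true)     # 'multilabel-indicator'
score_kind = type_of_target(y_scores)  # 'continuous-multioutput'

# Collapse each row to the index of its largest entry, i.e. a class label.
true_labels = np.argmax(y_true, axis=1)    # [1, 1, 0, 1, 0]
pred_labels = np.argmax(y_scores, axis=1)  # [0, 1, 0, 0, 0]

# Recall for the positive class (label 1): 1 true positive out of 3 positives.
recall = recall_score(true_labels, pred_labels)
print(recall)
```

Note that argmax amounts to a 0.5 decision threshold for a two-column softmax output. If you need a different threshold, compare the positive-class column against it instead, for example (y_scores[:, 1] > 0.3).astype(int).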