python · keras · scikit-learn · deep-learning · neural-network

How to calculate loss with KerasClassifier


I'm using KerasClassifier to wrap my Keras model as a scikit-learn estimator in order to perform K-fold cross validation.

model = KerasClassifier(build_fn=create_model, epochs=20, batch_size=8, verbose=1)
kfold = KFold(n_splits=10)
scoring = ['accuracy', 'precision', 'recall', 'f1']
results = cross_validate(estimator=model,
                         X=x_train,
                         y=y_train,
                         cv=kfold,
                         scoring=scoring,
                         return_train_score=True,
                         return_estimator=True)

Then I choose the best model among the 10 estimators returned by the function, according to the metrics:

best_model = results['estimator'][2]  # for example, the model from the third fold

Now I want to run predictions on x_test and get accuracy and loss metrics. How can I do that? I tried model.evaluate(x_test, y_test), but since the model is a KerasClassifier I get an error.


Solution

  • The point is that your KerasClassifier instance mimics standard scikit-learn classifiers. In other words, it behaves like a scikit-learn estimator and, as such, does not provide an .evaluate() method.

    Therefore, you can simply call best_model.score(X_test, y_test), which returns the accuracy, just as standard sklearn classifiers do. In addition, you can access the loss values recorded during training via the history_ attribute of your KerasClassifier instance.

    Here's an example:

    !pip install scikeras    
    
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split, cross_validate, KFold
    import tensorflow as tf
    import tensorflow.keras
    from tensorflow.keras.layers import Dense
    from tensorflow.keras.models import Sequential
    from scikeras.wrappers import KerasClassifier
    
    X, y = make_classification(n_samples=100, n_features=20, n_informative=5, random_state=42)
    
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    
    def build_nn():
        ann = Sequential()
        ann.add(Dense(20, input_dim=X_train.shape[1], activation='relu', name="Hidden_Layer_1"))
        ann.add(Dense(1, activation='sigmoid', name='Output_Layer'))
        ann.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
        return ann
    
    keras_clf = KerasClassifier(model=build_nn, optimizer="adam", optimizer__learning_rate=0.001, epochs=100, verbose=0)
    
    kfold = KFold(n_splits=10)
    scoring = ['accuracy', 'precision', 'recall', 'f1']
    results = cross_validate(estimator=keras_clf, X=X_train, y=y_train, scoring=scoring, cv=kfold, return_train_score=True, return_estimator=True)
    
    best_model = results['estimator'][2]
    
    # test-set accuracy (sklearn-style score)
    best_model.score(X_test, y_test)

    # per-epoch training loss values recorded during fitting
    best_model.history_['loss']
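
    If you also want the loss on the test set (which is what .evaluate() would have given you), one option is to go through the underlying fitted Keras model. The sketch below assumes scikeras, which stores that model in the wrapper's model_ attribute:

    # Sketch, assuming scikeras: model_ is the fitted Keras model held by the wrapper.
    # Keras' evaluate() then returns [loss, accuracy] for the compiled loss/metrics.
    test_loss, test_acc = best_model.model_.evaluate(X_test, y_test, verbose=0)
    print(test_loss, test_acc)

    This mirrors the model.evaluate(x_test, y_test) call from the question, just routed through the wrapper's fitted Keras model.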
    

    Finally, when in doubt, you can call dir(object) to list all attributes and methods of an object (dir(best_model) in your case).
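
    For instance, a quick illustrative snippet to narrow that listing down to the public attributes and methods:

    # show only the public attributes/methods of the fitted wrapper
    print([name for name in dir(best_model) if not name.startswith('_')])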