python · keras · scikit-learn · deep-learning · neural-network

How to calculate loss with KerasClassifier


I'm using KerasClassifier to wrap my Keras model as a scikit-learn estimator in order to perform K-fold cross validation.

model = KerasClassifier(build_fn=create_model, epochs=20, batch_size=8, verbose=1)
kfold = KFold(n_splits=10)
scoring = ['accuracy', 'precision', 'recall', 'f1']
results = cross_validate(estimator=model,
                         X=x_train,
                         y=y_train,
                         cv=kfold,
                         scoring=scoring,
                         return_train_score=True,
                         return_estimator=True)

Then I choose the best model among the 10 estimators returned by the function, according to the metrics:

best_model = results['estimator'][2]  # for example, the model from the third fold

Now I want to run predictions on x_test and get accuracy and loss metrics. How can I do that? I tried model.evaluate(x_test, y_test), but since the model is a KerasClassifier I get an error.


Solution

  • The point is that your KerasClassifier instance mimics standard scikit-learn classifiers. In other words, it behaves like a scikit-learn estimator and, as such, does not provide an .evaluate() method.

    Therefore, you can simply call best_model.score(X_test, y_test), which returns the accuracy, just as standard sklearn classifiers do. In addition, you can access the loss values recorded during training via the history_ attribute of your KerasClassifier instance.

    Here's an example:

    !pip install scikeras    
    
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split, cross_validate, KFold
    import tensorflow as tf
    import tensorflow.keras
    from tensorflow.keras.layers import Dense
    from tensorflow.keras.models import Sequential
    from scikeras.wrappers import KerasClassifier
    
    X, y = make_classification(n_samples=100, n_features=20, n_informative=5, random_state=42)
    
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    
    def build_nn():
        ann = Sequential()
        ann.add(Dense(20, input_dim=X_train.shape[1], activation='relu', name="Hidden_Layer_1"))
        ann.add(Dense(1, activation='sigmoid', name='Output_Layer'))
        ann.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
        return ann
    
    keras_clf = KerasClassifier(model=build_nn, optimizer="adam", optimizer__learning_rate=0.001, epochs=100, verbose=0)
    
    kfold = KFold(n_splits=10)
    scoring = ['accuracy', 'precision', 'recall', 'f1']
    results = cross_validate(estimator=keras_clf, X=X_train, y=y_train, scoring=scoring, cv=kfold, return_train_score=True, return_estimator=True)
    
    best_model = results['estimator'][2]
    
    # test-set accuracy (sklearn-style score)
    best_model.score(X_test, y_test)

    # per-epoch training loss values recorded during fitting
    best_model.history_['loss']
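
    If you also want the loss on the test set (which is what .evaluate() would have given you), one option is to go through the underlying fitted Keras model. The sketch below assumes scikeras, which stores that model in the wrapper's model_ attribute:

    # Sketch, assuming scikeras: model_ is the fitted Keras model held by the wrapper.
    # Keras' evaluate() then returns [loss, accuracy] for the compiled loss/metrics.
    test_loss, test_acc = best_model.model_.evaluate(X_test, y_test, verbose=0)
    print(test_loss, test_acc)

    This mirrors the model.evaluate(x_test, y_test) call from the question, just routed through the wrapper's fitted Keras model.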
    

    Finally, when in doubt, you can call dir(object) to list all attributes and methods of an object (dir(best_model) in your case).
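
    For instance, a quick illustrative snippet to narrow that listing down to the public attributes and methods:

    # show only the public attributes/methods of the fitted wrapper
    print([name for name in dir(best_model) if not name.startswith('_')])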