python machine-learning keras scikit-learn deep-learning

Issue setting up SciKeras model

I have an existing setup using scikit-learn, but am looking into expanding into deep learning with Keras. I am also using Dask, which recommends using SciKeras.

With the SciKeras KerasClassifier set up as shown below, fitting appears to work as expected (judging by the verbose output), but the model seems to have learned nothing at all. I have followed the SciKeras docs here, but I might have overlooked something.

With a scikit-learn random forest classifier the kappa score is about 0.44, with plain Keras it is about 0.55, and with SciKeras it is 0.0 (clearly an issue). Where in section 2 (Following SciKeras docs to use Keras) is the implementation error that prevents a result similar to the one achieved in section 3 (Exclusively using Keras)?

Below I have listed the current scikit-learn implementation with RF (as a baseline), the output with SciKeras (the actual output), and the output using Keras exclusively (the expected result).
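For reference, the overall accuracy, kappa, and weighted F1 figures in the outputs come from a helper (`cl.classification_metrics`) that is not shown here. Below is a minimal sketch of how such figures could be computed from one-hot targets, assuming the helper reduces both arrays to class indices with `argmax`. The function name `summarize_predictions` and the toy data are made up for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score


def summarize_predictions(y_true_onehot, y_pred_scores):
    """Reduce one-hot targets and per-class scores to class indices and score them."""
    y_true = np.argmax(y_true_onehot, axis=1)
    y_pred = np.argmax(y_pred_scores, axis=1)
    return {
        "overall": accuracy_score(y_true, y_pred),
        "kappa": cohen_kappa_score(y_true, y_pred),
        "weighted_f1": f1_score(y_true, y_pred, average="weighted", zero_division=0),
    }


# Toy data (hypothetical): 6 samples, 3 classes, one-hot encoded.
y_true_onehot = np.eye(3, dtype="uint8")[[0, 1, 2, 0, 1, 2]]

# A model that always predicts class 0 agrees with the targets only by chance.
constant_scores = np.tile([0.9, 0.05, 0.05], (6, 1))
constant_result = summarize_predictions(y_true_onehot, constant_scores)

# A model that reproduces the targets exactly.
perfect_result = summarize_predictions(y_true_onehot, y_true_onehot.astype(float))

print(constant_result)
print(perfect_result)
```

A kappa of exactly 0.0, as reported for the SciKeras run, is what a constant or chance-level predictor produces, which is why it points to a problem with the predictions rather than with training.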

1. Current output using scikit-learn random forest:

from sklearn.ensemble import RandomForestClassifier

def default_classifier():
    return RandomForestClassifier(oob_score=True, n_jobs=-1)

... ### Preprocessing stuff...

X_train, X_test, y_train, y_test = splits

# Define the Pipeline    
## Classification    
model = default_classifier()
model.fit(X_train, y_train)

## Evaluation Metrics
import numpy as np
from sklearn.model_selection import cross_val_score
# Note: cross_val_score clones the estimator and refits it on folds of the test split
score = cross_val_score(model, X_test, y_test, scoring='accuracy', cv=5, n_jobs=-1, error_score='raise')
print('Mean: %.3f (Std: %.3f)' % (np.mean(score), np.std(score)))

# Verbose with results...
columns, report, true_matrix, pred_matrix = cl.classification_metrics(model, splits, score)

Respective sklearn output:

Test Size:  0.2
Split Shapes:   [(79997, 96), (20000, 96), (79997, 12), (20000, 12)]
Mean: 0.374 (Std: 0.006)
Overall: 0.510  Kappa: 0.441
Weighted F1-Score: 0.539

2. Following SciKeras docs to use Keras:

from tensorflow import keras
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import train_test_split
import numpy as np

def fcn_model(hidden_layer_dim, meta):
    # note that meta is a special argument that will be
    # handed a dict containing input metadata
    n_features_in_ = meta["n_features_in_"]
    X_shape_ = meta["X_shape_"]
    n_classes_ = meta["n_classes_"]
    
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(n_features_in_, input_shape=X_shape_[1:]))
    model.add(keras.layers.Activation("relu"))
    model.add(keras.layers.Dense(hidden_layer_dim))
    model.add(keras.layers.Activation("relu"))
    model.add(keras.layers.Dense(n_classes_))
    model.add(keras.layers.Activation("softmax"))
    return model

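def get_model_fcn(modelargs=None):
    # Avoid a mutable default argument; extra keyword arguments are forwarded to KerasClassifier
    modelargs = modelargs or {}
    return KerasClassifier(fcn_model,
                           hidden_layer_dim=128,
                           epochs=10,
                           optimizer='adam',
                           loss='categorical_crossentropy',
                           metrics=['accuracy'],
                           fit__use_multiprocessing=True,
                           **modelargs)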

... ### Preprocessing stuff...

X_train, X_test, y_train, y_test = splits

# Define the Pipeline    
## Classification    
model = get_model_fcn()
model.fit(X_train, y_train)

## Evaluation Metrics
from sklearn.model_selection import cross_val_score
score = cross_val_score(model, X_test, y_test, scoring='accuracy', cv=5, n_jobs=-1, error_score='raise')
print('Mean: %.3f (Std: %.3f)' % (np.mean(score), np.std(score)))

columns, report, true_matrix, pred_matrix = cl.classification_metrics(model, splits, score)

Respective SciKeras output (kappa of 0.0 despite a steadily falling training loss):

Test Size:  0.2
Split Shapes:   [(79997, 96), (20000, 96), (79997, 12), (20000, 12)]
Epoch 1/10
2500/2500 [==============================] - 4s 1ms/step - loss: 1.6750 - accuracy: 0.3762
Epoch 2/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.3132 - accuracy: 0.5021
Epoch 3/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.2295 - accuracy: 0.5371
Epoch 4/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.1651 - accuracy: 0.5599
Epoch 5/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.1178 - accuracy: 0.5806
Epoch 6/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0889 - accuracy: 0.5935
Epoch 7/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0845 - accuracy: 0.5922
Epoch 8/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0548 - accuracy: 0.6043
Epoch 9/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0415 - accuracy: 0.6117
Epoch 10/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0316 - accuracy: 0.6172
Mean: 0.000 (Std: 0.000)
625/625 [==============================] - 0s 700us/step # Here it is running model.predict(X_test)
Overall: 0.130  Kappa: 0.000
Weighted F1-Score: 0.030
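
One possible reading of the `Mean: 0.000` line, offered as an assumption rather than a confirmed diagnosis: for 2-D indicator targets, scikit-learn's `accuracy_score` computes subset accuracy, so a row only counts as correct when every one of its columns matches. The sketch below uses made-up arrays to show how predictions that are not valid one-hot rows score zero regardless of how training went. `count_valid_onehot_rows` is a hypothetical helper for inspecting what `model.predict` returns.

```python
import numpy as np
from sklearn.metrics import accuracy_score


def count_valid_onehot_rows(y_pred):
    """Count rows that contain exactly one 1 and are otherwise 0."""
    y_pred = np.asarray(y_pred)
    is_binary = np.isin(y_pred, (0, 1)).all(axis=1)
    return int(np.sum(is_binary & (y_pred.sum(axis=1) == 1)))


# Made-up one-hot targets: 4 samples, 3 classes.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])

# Predictions with no active class in any row.
y_pred_empty = np.zeros_like(y_true)

# Predictions that match the targets in three of the four rows.
y_pred_good = y_true.copy()
y_pred_good[3] = [0, 1, 0]

# Subset accuracy: a row is correct only if all of its columns match.
subset_acc_empty = accuracy_score(y_true, y_pred_empty)
subset_acc_good = accuracy_score(y_true, y_pred_good)

print(subset_acc_empty, count_valid_onehot_rows(y_pred_empty))
print(subset_acc_good, count_valid_onehot_rows(y_pred_good))
```

Running a check like this on the array returned by `model.predict(X_test)` would show whether the wrapper hands back valid one-hot rows at all.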

3. Exclusively using Keras:

# meta copies what SciKeras passes to the Keras model
meta = {
    #'classes_': array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]), 
    #'target_type_': 'multilabel-indicator', 
    'y_dtype_': np.dtype('uint8'), 
    'y_ndim_': 2, 
    'X_dtype_': np.dtype('float32'), 
    'X_shape_': (79997, 96), 
    'n_features_in_': 96, 
    #'target_encoder_': ClassifierLabelEncoder(loss='categorical_crossentropy'), 
    'n_classes_': 12, 
    'n_outputs_': 1, 
    'n_outputs_expected_': 1, 
    #'feature_encoder_': FunctionTransformer()
}

def fcn_model(hidden_layer_dim, meta):
    # note that meta is a special argument that will be
    # handed a dict containing input metadata
    n_features_in_ = meta["n_features_in_"]
    X_shape_ = meta["X_shape_"]
    n_classes_ = meta["n_classes_"]
    
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(n_features_in_, input_shape=X_shape_[1:]))
    model.add(keras.layers.Activation("relu"))
    model.add(keras.layers.Dense(hidden_layer_dim))
    model.add(keras.layers.Activation("relu"))
    model.add(keras.layers.Dense(n_classes_))
    model.add(keras.layers.Activation("softmax"))
    return model

def get_model_fcn(modelargs=None):
    # modelargs is unused here; it is kept so the signature matches the SciKeras variant
    model = fcn_model(128, meta)
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    return model

... ### Preprocessing stuff...

X_train, X_test, y_train, y_test = splits

# Define the Pipeline    
## Classification    
model = get_model_fcn()
model.fit(X_train, y_train, epochs=10)

## Evaluation Metrics
#from sklearn.model_selection import cross_val_score
#score = cross_val_score(model, X_test, y_test, scoring='accuracy', cv=5, n_jobs=-1, error_score='raise')
#print('Mean: %.3f (Std: %.3f)' % (np.mean(score), np.std(score)))

# Note: with the cross-validation commented out, `score` is not defined in this snippet
columns, report, true_matrix, pred_matrix = cl.classification_metrics(model, splits, score)

Expected output from using Keras:

Test Size:  0.2
Split Shapes:   [(79997, 96), (20000, 96), (79997, 12), (20000, 12)]
Epoch 1/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.6941 - accuracy: 0.3730
Epoch 2/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.3193 - accuracy: 0.5002
Epoch 3/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.2206 - accuracy: 0.5399
Epoch 4/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.1585 - accuracy: 0.5613
Epoch 5/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.1221 - accuracy: 0.5758
Epoch 6/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0923 - accuracy: 0.5928
Epoch 7/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0682 - accuracy: 0.5984
Epoch 8/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0611 - accuracy: 0.6046
Epoch 9/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0445 - accuracy: 0.6138
Epoch 10/10
2500/2500 [==============================] - 3s 1ms/step - loss: 1.0236 - accuracy: 0.6186
Overall: 0.601  Kappa: 0.548
Weighted F1-Score: 0.600

Solution

Apparently this was a bug in how SciKeras handled multi-class one-hot encoded targets; the issue was handled here.
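
Until a release with the fix is installed, one possible workaround (not confirmed in the linked issue) is to avoid one-hot targets altogether: collapse them to integer class labels so that they are recognised as `multiclass` rather than `multilabel-indicator`, and switch the loss to `sparse_categorical_crossentropy`. The sketch below covers only the target conversion and uses scikit-learn's `type_of_target` to check how each layout is classified. `onehot_to_labels` is a hypothetical helper name, and the random labels stand in for the real `y_train`.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target


def onehot_to_labels(y_onehot):
    """Collapse an (n_samples, n_classes) one-hot array to 1-D integer labels."""
    return np.argmax(np.asarray(y_onehot), axis=1)


# Stand-in for y_train: 12 classes, one-hot encoded as uint8 like in the question.
rng = np.random.default_rng(0)
labels = rng.integers(0, 12, size=200)
labels[:12] = np.arange(12)  # make sure every class appears at least once
y_onehot = np.eye(12, dtype="uint8")[labels]

y_labels = onehot_to_labels(y_onehot)

print(type_of_target(y_onehot))  # how the 2-D 0/1 layout is classified
print(type_of_target(y_labels))  # how the 1-D integer layout is classified
```

The first call reports the same `multilabel-indicator` type that appears in the `meta` dict above, while the converted labels are reported as `multiclass`.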