I am trying to train a network on the Swiss Roll dataset with three features X = [x1, x2, x3] for the classification task. There are four classes with labels 1, 2, 3, 4, and the vector y contains the labels for all the data.
A row in the X matrix looks like this:
-5.2146470e+00 7.0879738e+00 6.7292474e+00
The shape of X is (100, 3), and the shape of y is (100,).
I want to use Radial Basis Functions to train this model. I have used the custom RBFLayer from this StackOverflow answer (also see this explanation) to build the RBFLayer. I want to use a couple of Keras Dense layers to build the network for classification.
I have used a Dense layer for the first layer, followed by the custom RBFLayer, and two other Dense layers. Here's the code:
model = Sequential()
model.add((Dense(100, input_dim=3)))
# number of units = 10, gamma = 0.05
model.add(RBFLayer(10,0.05))
model.add(Dense(15, activation='relu'))
model.add(Dense(1, activation='softmax'))
This model gives me zero accuracy. I think there is something wrong with the model architecture, but I can't figure out what is the issue.
Also, I thought the number of units in the last Dense layer should match the number of classes, which is 4 in this case. But when I set the number of units to 4 in the last layer, I get the following error:
ValueError: Shapes (None, 1) and (None, 4) are incompatible
Can you help me with this model architecture?
I faced the same issue while practicing with multi-class classification. Where I had 7 features and the model classifies into 7 classes. I tried encoding the labels and it fixed the issue.
First import LabelEncoder
class from sklearn
and import to_categorical
from tensorflow
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical
Then, initialize an object to the LabelEncoder class and transform your labels before fitting and training the model.
encoder = LabelEncoder()
encoder.fit(y)
y = encoder.transform(y)
y = to_categorical(y)
Note that you have to use np.argmax for getting the actual predicted classification. in my case, the prediction is stored in variable called res
res = np.argmax(res, axis=None, out=None)
You can get your actual predicted class after this line. Looking forward to help you. Hope it solved your problem.