Is there a magic combination of parameters that lets the model infer correctly from data it hasn't seen before?
from sklearn.neural_network import MLPClassifier

clf = MLPClassifier(
    activation='logistic',
    max_iter=100,
    hidden_layer_sizes=(2,),
    solver='lbfgs')
X = [[0, 0],    # 3 training samples, 2 features each
     [0, 1],
     # [1, 0],  # held out
     [1, 1]]
y = [0,
     1,
     # 1,
     0]  # class of each sample
clf.fit(X, y)
assert clf.predict([[0, 1]]) == [1]
assert clf.predict([[1, 0]]) == [1]
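For what it's worth, I doubt such a magic combination exists. With only three of the four XOR points in the training set, the labels are equally consistent with at least two hypotheses (XOR, which sends [1, 0] to 1, and "x2 AND NOT x1", which sends it to 0), so which one the network lands on depends on its random initialization. The random_state sweep below is my own illustration, not from the original code:

from sklearn.neural_network import MLPClassifier

X = [[0, 0], [0, 1], [1, 1]]
y = [0, 1, 0]

# Tally how often the unseen point [1, 0] is classified as 1
# across different random initializations.
hits = 0
for seed in range(20):
    clf = MLPClassifier(
        activation='logistic',
        max_iter=100,
        hidden_layer_sizes=(2,),
        solver='lbfgs',
        random_state=seed)
    clf.fit(X, y)
    hits += int(clf.predict([[1, 0]])[0] == 1)

print(f"[1, 0] -> 1 in {hits}/20 runs")  # varies from run setup to run setup; not guaranteed either way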
How about using a kernel? A kernel is a way for a model to extract the desirable features from data. Commonly used kernels, however, may not satisfy this requirement.
I believe they try to find a 'cut' hyperplane between one hyperplane which contains [0, 0] and [1, 1], and another hyperplane which contains [0, 1]. In 2-dimensional space, for example, one hyperplane is y = x and the other is y = x + 1; the 'cut' hyperplane could then be y = x + 1/2.
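To make that concrete, here is a small check (my own snippet, not from the original answer) of which side of the line y = x + 1/2 each point falls on. Note that the unseen point [1, 0] lands on the same side as [0, 0] and [1, 1], so a linear cut would label it 0, not 1:

# Sign of (x2 - x1 - 0.5) tells us which side of y = x + 1/2 a point is on.
for p in [[0, 0], [0, 1], [1, 1], [1, 0]]:
    side = p[1] - p[0] - 0.5
    print(p, 'class 1 side' if side > 0 else 'class 0 side')
# [0, 0] -> class 0 side
# [0, 1] -> class 1 side
# [1, 1] -> class 0 side
# [1, 0] -> class 0 side   <- not what we want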
So I suggest the following kernel.
import numpy as np

def kernel(X1, X2):
    # Map each sample [x1, x2] to the single feature (x1 - x2)**2,
    # then return the inner products of the mapped samples.
    X1 = np.array([[(x[0] - x[1]) ** 2] for x in X1])
    X2 = np.array([[(x[0] - x[1]) ** 2] for x in X2])
    return np.dot(X1, X2.T)
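As a quick sanity check (again my own snippet), this is the Gram matrix the kernel produces on the three training samples; only the [0, 1] entry is non-zero because the other two samples map to the zero feature:

import numpy as np

X = [[0, 0], [0, 1], [1, 1]]
print(kernel(X, X))
# [[0 0 0]
#  [0 1 0]
#  [0 0 0]]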
What this kernel does is this: it squares the difference between the two coordinates, (x - y)^2. With this feature extraction, the data is featurized as follows:
[0, 0] → [0]
[0, 1] → [1]
[1, 1] → [0]
And also for the unseen datum:
[1, 0] → [1]
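You can reproduce this featurization directly; the helper phi below is my own naming for the feature map that the kernel only uses implicitly through the inner product:

def phi(x):
    # The implicit feature map behind the kernel above.
    return [(x[0] - x[1]) ** 2]

for p in [[0, 0], [0, 1], [1, 1], [1, 0]]:
    print(p, '->', phi(p))
# [0, 0] -> [0]
# [0, 1] -> [1]
# [1, 1] -> [0]
# [1, 0] -> [1]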
So the following classifier, once trained, will make the prediction you desire ([1, 0] → [1]).
from sklearn import svm

clf = svm.SVC(kernel=kernel, max_iter=100)
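Put together, here is a complete runnable sketch (my addition; note I pass a larger C than the default so the fit behaves like a hard margin on this tiny set, which is separable in the kernel's feature space):

from sklearn import svm
import numpy as np

def kernel(X1, X2):
    X1 = np.array([[(x[0] - x[1]) ** 2] for x in X1])
    X2 = np.array([[(x[0] - x[1]) ** 2] for x in X2])
    return np.dot(X1, X2.T)

X = [[0, 0], [0, 1], [1, 1]]
y = [0, 1, 0]

clf = svm.SVC(kernel=kernel, C=10, max_iter=100)
clf.fit(X, y)

print(clf.predict([[0, 1]]))  # expected: [1]
print(clf.predict([[1, 0]]))  # expected: [1], the unseen XOR case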
Model selection is very important in machine learning. A model which does not know that [0, 0] and [1, 1] belong to one group, and that [0, 1] and [1, 0] belong to the other, may not make the prediction you expect.