python scikit-learn statistics naivebayes

How Naive Bayes works

I have already read that Naive Bayes is a classification algorithm that makes predictions based on the data you give it, but in this example I just can't see how the output [3,4] came about.

Here is the example:

import numpy as np
from sklearn.naive_bayes import GaussianNB

# Assign predictor and target variables
x = np.array([[-3,7],[1,5], [1,2], [-2,0], [2,3], [-4,0], [-1,1], [1,1], [-2,2], [2,7], [-4,1], [-2,7]])
Y = np.array([3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4])

# Create a Gaussian classifier
model = GaussianNB()

# Train the model using the training set
model.fit(x, Y)

# Predict output
predicted = model.predict([[1,2],[3,4]])
print(predicted)

Output: ([3,4])

Can anyone explain how [3,4] was generated in this case and what it means?

Solution

Please go through the sample code below.

from sklearn.naive_bayes import GaussianNB
import numpy as np

model = GaussianNB()
# Assign predictor and target variables
x = np.array([[-3,7],[1,5], [1,2], [-2,0], [2,3], [-4,0], [-1,1], [1,1], [-2,2], [2,7], [-4,1], [-2,7]])
Y = np.array([3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4])
print(x)  # show the training inputs

# Train the model on the inputs x and labels Y
model.fit(x, Y)

# Predict output for three test points
predicted = model.predict([[1,2],[35,6], [2,6]])
print(predicted)

output:

[3 4 4]

Here we get three predictions, one per test point: 3 for [1,2], 4 for [35,6], and 4 for [2,6].

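Your labels in Y contain only the values 3 and 4, so every prediction is either 3 or 4. These are class labels, not coordinates: for your test data the model assigned class 3 to [1,2] and class 4 to [3,4], which is why it printed [3,4]. It is only a coincidence that this looks like your second test point. Hope this clarifies.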
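To see that the predictions are class labels, you can ask the fitted model which classes it knows and how probable it considers each one for every test point. The following is a minimal sketch using your data; the variable names `test_points` and `probabilities` are just chosen here for illustration.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

x = np.array([[-3, 7], [1, 5], [1, 2], [-2, 0], [2, 3], [-4, 0],
              [-1, 1], [1, 1], [-2, 2], [2, 7], [-4, 1], [-2, 7]])
Y = np.array([3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4])

model = GaussianNB().fit(x, Y)

test_points = np.array([[1, 2], [3, 4]])

# The only labels the model knows are the distinct values found in Y.
print(model.classes_)

# One row per test point, one column per entry of model.classes_;
# each row sums to 1.
probabilities = model.predict_proba(test_points)
print(probabilities)

# predict() returns, for each test point, the class with the highest probability.
predicted = model.predict(test_points)
print(predicted)
```

If you change the labels in Y to, say, 0 and 1, the predictions change to those labels as well, while the test points stay the same.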

From your code, take the first three entries as an example:

X_1   X_2   Y
-3    7     3
 1    5     3
 1    2     3

X_1 and X_2 form the feature vector (input) and Y is the output (the class label). From the inputs and outputs, the algorithm fits a mathematical formula; for GaussianNB this is the mean and variance of each feature within each class, plus how often each class occurs. When you give it test data, it applies the same formula to generate the output (Y). That's how you got [3,4].
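The "formula" can be written out by hand. The sketch below is illustrative only and is not scikit-learn's actual code: `gaussian_nb_predict` is a hypothetical helper name, and it leaves out the tiny variance smoothing term that scikit-learn adds. For each class it combines the class frequency with a Gaussian density per feature, then picks the class with the highest score.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

x = np.array([[-3, 7], [1, 5], [1, 2], [-2, 0], [2, 3], [-4, 0],
              [-1, 1], [1, 1], [-2, 2], [2, 7], [-4, 1], [-2, 7]])
Y = np.array([3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4])


def gaussian_nb_predict(x_train, y_train, x_test):
    """Hand-rolled Gaussian Naive Bayes (hypothetical helper, for illustration)."""
    classes = np.unique(y_train)
    scores = []
    for label in classes:
        rows = x_train[y_train == label]
        prior = len(rows) / float(len(x_train))  # how often this class occurs
        mean = rows.mean(axis=0)                 # per-feature mean in this class
        var = rows.var(axis=0)                   # per-feature variance in this class
        # Sum over features of log N(x_i | mean_i, var_i), one value per test row.
        log_likelihood = -0.5 * np.sum(
            np.log(2.0 * np.pi * var) + (x_test - mean) ** 2 / var, axis=1)
        scores.append(np.log(prior) + log_likelihood)
    # For each test row, pick the class with the largest score.
    return classes[np.argmax(np.array(scores), axis=0)]


test_points = np.array([[1, 2], [3, 4]])

manual = gaussian_nb_predict(x, Y, test_points)
sklearn_pred = GaussianNB().fit(x, Y).predict(test_points)

print(manual)
print(sklearn_pred)
```

Both lines print the same class labels for your test data, which shows that the output comes from comparing per-class scores rather than from the test coordinates themselves.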