Tags: python, scikit-learn, statistics, naivebayes

How does Naive Bayes work?


I have read that Naive Bayes is a classification algorithm that makes predictions based on the data you give it, but in this example I can't see how the output [3,4] came about.

Here is the example:

from sklearn.naive_bayes import GaussianNB
import numpy as np

# Assign predictor and target variables
x = np.array([[-3,7],[1,5], [1,2], [-2,0], [2,3], [-4,0], [-1,1], [1,1], [-2,2], [2,7], [-4,1], [-2,7]])
Y = np.array([3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4])

# Create a Gaussian classifier
model = GaussianNB()

# Train the model on the training set
model.fit(x, Y)

# Predict the output for two new points
predicted = model.predict([[1,2],[3,4]])
print(predicted)

Output: ([3,4])

Can anyone explain how [3,4] was generated in this case and what it means?


Solution

  • Please go through the sample code below.

    from sklearn.naive_bayes import GaussianNB
    import numpy as np

    model = GaussianNB()
    # Assign predictor and target variables
    x = np.array([[-3,7],[1,5], [1,2], [-2,0], [2,3], [-4,0], [-1,1], [1,1], [-2,2], [2,7], [-4,1], [-2,7]])
    Y = np.array([3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4])
    print(x)

    # Train the model on the training set
    model.fit(x, Y)

    # Predict the output for three new points
    predicted = model.predict([[1,2],[35,6], [2,6]])
    print(predicted)
    

    Output:

    [3 4 4]
    

    The result has three values, one per test point, in the same order: 3 is the predicted class for [1,2], 4 for [35,6], and 4 for [2,6].

    Every label in your training data is either 3 or 4, so the classifier can only ever predict 3 or 4. For your test data it predicted 3 for [1,2] and 4 for [3,4], which is the output [3,4]. Hope this clarifies.

    As an illustration, take just the first three entries from your code, shown in the chart below. X_1 and X_2 are the feature vector (the input) and Y is the output. From the inputs and outputs the algorithm fits a mathematical model: for Gaussian Naive Bayes, the mean and variance of each feature within each class, plus how often each class occurs. When you give it test data, it uses the same model to generate the output (Y). That's how you got [3,4].

    [Image: chart of the first three training rows, with columns X_1, X_2 and Y]
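To make the "formula" less abstract, here is a rough sketch of the calculation done by hand in plain Python, without scikit-learn. The helper names `fit`, `log_score` and `predict` are made up for this illustration and are not scikit-learn's API. It assumes the standard Gaussian Naive Bayes rule: for each class, add the log of the class frequency to the log of a normal density for each feature, then pick the class with the highest total. scikit-learn's GaussianNB also adds a tiny smoothing term to the variances, which this sketch leaves out.

```python
import math

# Training data copied from the question: two features per row, labels 3 or 4.
X = [[-3, 7], [1, 5], [1, 2], [-2, 0], [2, 3], [-4, 0],
     [-1, 1], [1, 1], [-2, 2], [2, 7], [-4, 1], [-2, 7]]
Y = [3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4]


def fit(rows, labels):
    """Return {label: (prior, means, variances)} estimated from the data."""
    params = {}
    for label in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == label]
        n = len(members)
        # One mean per feature, computed over the rows of this class only.
        means = [sum(col) / n for col in zip(*members)]
        # Population variance (divide by n), one value per feature.
        variances = [sum((v - m) ** 2 for v in col) / n
                     for col, m in zip(zip(*members), means)]
        params[label] = (n / len(rows), means, variances)
    return params


def log_score(point, prior, means, variances):
    """log P(class) plus the log Gaussian density of each feature."""
    score = math.log(prior)
    for v, m, var in zip(point, means, variances):
        score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
    return score


def predict(params, points):
    """For each point, pick the label with the highest log score."""
    return [max(params, key=lambda label: log_score(p, *params[label]))
            for p in points]


params = fit(X, Y)
print(predict(params, [[1, 2], [3, 4]]))           # [3, 4]
print(predict(params, [[1, 2], [35, 6], [2, 6]]))  # [3, 4, 4]
```

For [1,2] the two classes score very close to each other, with class 3 slightly ahead, while [3,4] sits much nearer the class 4 averages. So each number in the output is simply the label of the class whose fitted distribution explains that test point best.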