Tags: python, machine-learning, scikit-learn, classification, prediction

Naive Bayes Gaussian Classification Prediction not taking array in SK-Learn


I have built the following Gaussian Naive Bayes model in scikit-learn:

chess_gnb = GaussianNB().fit(raw[['elo', 'opponent_rating', 'winner_loser_elo_diff']],raw['winner'])

I then made a test array and attempted to feed it into the model:

test1 = [['elo', 1000], ['opponent_rating', 800], ['winner_loser_elo_diff', 200]]
chess_gnb.predict(test1)

However, I'm getting this error:

ValueError: Unable to convert array of bytes/strings into decimal numbers with dtype='numeric'

The 'winner' prediction should be a string that can take one of two values. Why am I getting the ValueError if all of my inputs are integers?


Solution

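  • predict expects one row per sample and one numeric column per feature. Your test1 is a list of [name, value] pairs, so it contains strings, and that is what raises the ValueError. Pass a DataFrame with the same columns you trained on instead. A reproducible example with random stand-in data: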

    import pandas as pd
    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    
    np.random.seed(123)
    
    raw = pd.DataFrame(np.random.uniform(0,1000,(100,3)),
                       columns = ['elo','opponent_rating','winner_loser_elo_diff'])
    raw['winner'] = np.random.binomial(1,0.5,100)
    
    chess_gnb = GaussianNB().fit(raw[['elo', 'opponent_rating', 'winner_loser_elo_diff']],raw['winner'])
    

    A one-row DataFrame with the training column names works:

    test1 = pd.DataFrame({'elo': [1000],'opponent_rating':[800],'winner_loser_elo_diff':[200]})
    chess_gnb.predict(test1)
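
To make the difference concrete, here is a minimal sketch that runs both inputs side by side. It assumes the same random stand-in data as above (not real chess games), and the names features, bad, and good are only for illustration. The exact error text may differ between scikit-learn versions, but the original input should be rejected with a ValueError in each case.

```python
import numpy as np
import pandas as pd
from sklearn.naive_bayes import GaussianNB

features = ['elo', 'opponent_rating', 'winner_loser_elo_diff']

# Random stand-in data with a 0/1 label, only to get a fitted model.
rng = np.random.RandomState(123)
raw = pd.DataFrame(rng.uniform(0, 1000, (100, 3)), columns=features)
raw['winner'] = rng.binomial(1, 0.5, 100)

chess_gnb = GaussianNB().fit(raw[features], raw['winner'])

# The input from the question: 3 rows x 2 columns, mixing strings and numbers.
bad = [['elo', 1000], ['opponent_rating', 800], ['winner_loser_elo_diff', 200]]
try:
    chess_gnb.predict(bad)
    failed = False
except ValueError as err:
    failed = True
    print('Rejected:', err)

# One sample: 1 row x 3 numeric columns, labelled like the training data.
good = pd.DataFrame([[1000, 800, 200]], columns=features)
pred = chess_gnb.predict(good)
print('Predicted class:', pred[0])
```

The shape is what matters here: each inner list is one sample, and its values must line up with the training columns in order. Using a DataFrame with the same column names keeps that alignment explicit.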