I am trying to build and train a machine learning model that predicts which presidential candidate won in each county. I have the following columns in my training data:
Total population, Median age, % BachelorsDeg or higher, Unemployment rate, Per capita income, Total households, Average household size, % Owner occupied housing, % Renter occupied housing, % Vacant housing, Median home value, Population growth, House hold growth, Per capita income growth, Winner
I am new to data science. I know Naive Bayes is a commonly used classifier when a prediction depends on multiple properties. However, I read that the first step for a Naive Bayes classifier is building a frequency table. My problem is that all of the properties above are continuous numerical values and don't fall into categories like "Yes" or "No". Does that mean I should not use a Naive Bayes classifier?
I also considered using a k-nearest neighbors algorithm, but it doesn't look like it would be the most accurate or weight the properties correctly for me. I am looking for a supervised algorithm because I have labeled training data. Can anyone recommend which algorithm to use? In addition, being new to the field, how can I figure out which algorithm to use on my own in the future?
You can use artificial neural networks.
To create, train, test and evaluate neural networks you can use a couple of libraries:
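As a rough illustration of that create/train/test/evaluate workflow, here is a minimal sketch. It assumes scikit-learn (my choice for the example, not necessarily one of the libraries meant above) and uses synthetic data from `make_classification` as a stand-in for the county table: 14 continuous features and a binary "Winner" label. The hidden layer size, iteration limit, and split ratio are arbitrary assumptions, not tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder data (assumption): 14 continuous features per "county"
# and a binary label standing in for the Winner column.
X, y = make_classification(
    n_samples=600, n_features=14, n_informative=8, random_state=0
)

# Hold out 25% of the rows so the model is evaluated on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Features such as population and median age live on very different scales,
# so standardize them before feeding them to the network.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)

model.fit(X_train, y_train)           # train
predictions = model.predict(X_test)   # test
accuracy = model.score(X_test, y_test)  # evaluate (fraction classified correctly)
print(f"held-out accuracy: {accuracy:.3f}")
```

With real data, you would replace the synthetic `X` and `y` with the feature columns and the Winner column. Swapping `MLPClassifier` for another scikit-learn classifier inside the same pipeline is a simple way to compare algorithms on the same held-out split.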