machine-learning training-data encog

Selecting a machine learning training method


I have the following data which has already been normalized:

  • customer id
  • customer age
  • customer location
  • home owner
  • car value
  • risk factor
  • married
  • package a
  • package b
  • package c

Based on all the factors above, I would like to predict which of packages A, B, or C a customer is likely to purchase.

However, I am lost in a sea of options. There are many training methodologies, such as linear perceptrons, genetic algorithms, time series forecasting, auto-associative networks, and many more.

How do I know which one is likely to work best for solving this type of problem where there is more than one output?

Edit:

My question assumes that there is an optimal strategy for this particular scenario, because I understand that certain algorithms are used more often in certain scenarios; for example, genetic algorithms are often used in handwriting recognition programs.


Solution

  • I'd recommend looking up the no free lunch theorem. In effect, you can't identify the "best classifier" for a problem in advance. Personally, I would use scikit-learn and try out several classifiers with proper training, cross-validation, and test sets, then see which one gives the best result.

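    Also, it depends on your case. Can customers purchase multiple packages, or only one? If only one, this is a multi-class problem; if several, it is a multi-label problem, and the setup changes accordingly.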
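
To make the "try several classifiers" suggestion concrete, here is a minimal sketch, assuming scikit-learn is installed and that each customer buys exactly one package. The real table isn't shown, so the data below is synthetic (generated with `make_classification`), and the three candidate models and the 5-fold setting are arbitrary choices for illustration, not a recommendation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the normalized customer table (assumption):
# 6 feature columns (age, location, home owner, car value, risk factor,
# married) and one label per customer: 0 = package A, 1 = B, 2 = C.
# The customer id is left out because it is an identifier, not a feature.
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Hold out a test set that is not used while comparing models.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical shortlist; swap in whatever classifiers you want to compare.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "k_nearest_neighbors": KNeighborsClassifier(n_neighbors=5),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy, computed on the training split only.
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=5).mean()
    for name, model in candidates.items()
}
for name, score in sorted(cv_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean CV accuracy = {score:.3f}")

# Refit the best-scoring candidate and check it once on the held-out set.
best_name = max(cv_scores, key=cv_scores.get)
best_model = candidates[best_name].fit(X_train, y_train)
test_accuracy = best_model.score(X_test, y_test)
print(f"best: {best_name}, held-out test accuracy = {test_accuracy:.3f}")
```

If customers can buy more than one package, the target would instead be a three-column 0/1 matrix (one column per package). Wrapping a model in `sklearn.multioutput.MultiOutputClassifier` is one way to handle that case.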