This is probably a newbie question about possible classification algorithms, so please bear with me. I have a dataset that comprises both nominal and numeric attributes, which may look like the example below (not the actual dataset). What kind of algorithm would be best to predict the class and measure the accuracy (preferably in Python/Java)?
Classes: classA, classB, classC
attribute1: Recurrence <Yes, No>
attribute2: Subject <Math, Science, Geography>
attribute3: ProbabilityA <0.0 - 1.0>
attribute4: ProbabilityB <0.0 - 1.0>
attribute5: ProbabilityC <0.0 - 1.0>
The nominal data can contain numeric values in [1, -1], where 1 represents present and -1 represents not present, or it can be a set of string values such as ['YES', 'NO'] or ['Type1', 'Type2', 'Type3']. The numeric values express the likelihood of an attribute and lie in the range [0, 1]: the closer the value is to 1, the more likely the attribute evaluates to true.
Well, this is by no means a "newbie question", and is in fact quite complicated. While Inti's suggestion is certainly a good start, it really depends upon so many factors that there is no easy "right answer".
Some things to consider:
Until some more info like this is known, it's tough to give very precise details. (In general, on this forum, the more effort you put into the question, the more effort others put into their answers.)
That being said, here are some buzz words to start looking up, to get your head around the possibilities:
The world of potential options in machine learning algos is pretty huge, nothing works perfectly, and nothing works equally well in all situations. This wiki page is not so great, but it's a decent start on finding algos.
Once you've decided which algo you think will work for your case, look up a library / implementation in Python or Java or what-have-you. With SciPy and NumPy as a base, you can assume that Python has a pretty large library of possibilities. I suspect Java also has a huge library, but I personally know Python far better.
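To make the overall workflow concrete, here is a minimal sketch in plain Python (standard library only, Python 3.8+ for `math.dist`). It is not a recommendation of a specific algorithm: k-nearest neighbours is swapped in purely because it is short to write. The category lists come from the example in the question; the helper names (`one_hot`, `encode`, `knn_predict`, `accuracy`, `make_toy_data`) and the toy labelling rule (the class with the largest probability attribute wins) are assumptions made up for illustration, not anything known about the real dataset.

```python
import math
import random
from collections import Counter

# Category lists taken from the example attributes in the question.
RECURRENCE = ["Yes", "No"]
SUBJECT = ["Math", "Science", "Geography"]
CLASSES = ["classA", "classB", "classC"]


def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    return [1.0 if value == c else 0.0 for c in categories]


def encode(row):
    """Turn (recurrence, subject, prob_a, prob_b, prob_c) into a numeric vector."""
    recurrence, subject, prob_a, prob_b, prob_c = row
    return (one_hot(recurrence, RECURRENCE)
            + one_hot(subject, SUBJECT)
            + [prob_a, prob_b, prob_c])


def knn_predict(train_x, train_y, x, k=3):
    """Predict the majority class among the k nearest training vectors."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test vectors whose predicted class matches the true class."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_y)


def make_toy_data(n, seed=0):
    """Hypothetical data: the label is the class with the largest probability."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        probs = [rng.random() for _ in CLASSES]
        rows.append((rng.choice(RECURRENCE), rng.choice(SUBJECT), *probs))
        labels.append(CLASSES[probs.index(max(probs))])
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_toy_data(200)
    vectors = [encode(r) for r in rows]
    # Simple hold-out split: first 150 rows for training, the rest for testing.
    train_x, test_x = vectors[:150], vectors[150:]
    train_y, test_y = labels[:150], labels[150:]
    print("hold-out accuracy:", accuracy(train_x, train_y, test_x, test_y))
```

The point is the shape of the pipeline rather than the classifier itself: encode the nominal attributes as numbers, hold some rows back, and measure accuracy only on the held-back rows. Swapping in a library implementation of another algo should leave the other steps largely unchanged.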