I'm comparing various classification algorithms for a project using KNIME. I was very happy with the results I got for Support Vector Machines (LibSVM). I then wanted to try hierarchical classification and installed the RapidMiner plugin for KNIME. To get things working, I first tested the SVM implementation without hierarchies.
Comparing the results of the KNIME LibSVM implementation and the RapidMiner LibSVM implementation, I noticed that the RapidMiner implementation yielded much worse results: the KNIME implementation produced an error rate of approximately 2.4%, while the RapidMiner one produced an error rate of approximately 61%. Why is that? Am I doing something wrong?
I use C-SVC SVMs with a linear kernel, cost 1.0, epsilon 0.001 and an 80 MB cache for both implementations.
The documents are Wikipedia article texts, preprocessed, transformed to binary document vectors and each labeled with a type.
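For readers unfamiliar with the representation, a binary document vector has one position per vocabulary term, set to 1 if the term occurs in the document and 0 otherwise. The following is only a minimal sketch in plain Python, not the KNIME preprocessing itself; the function names and the toy documents are hypothetical.

```python
def build_vocabulary(documents):
    """Collect the sorted set of distinct terms across all tokenized documents."""
    return sorted({term for doc in documents for term in doc})


def to_binary_vector(doc, vocabulary):
    """Return 1 for each vocabulary term present in the document, else 0."""
    present = set(doc)
    return [1 if term in present else 0 for term in vocabulary]


# Hypothetical, already tokenized toy documents (not the real data set).
docs = [
    ["svm", "kernel", "margin"],
    ["wikipedia", "article", "kernel"],
]

vocab = build_vocabulary(docs)
vectors = [to_binary_vector(d, vocab) for d in docs]
```

Term frequency is deliberately discarded here: a term that occurs ten times is encoded the same as one that occurs once.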
I hope you can help me.
You do not need to include the Row IDs in this case (on the "Row ID" tab, click the button so that it shows "Do not use" if it currently shows "Use" and the text field is not disabled), and you should not perform "Nominal to..." transformations on them. After that, you should get similar results in both cases.
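One plausible reason the Row IDs hurt, offered as an assumption rather than a confirmed description of what the plugin does internally: every Row ID is unique, so turning it into indicator columns adds one feature per training row that no unseen row can ever share. The sketch below illustrates this in plain Python; the helper names and the Row ID values are hypothetical.

```python
def one_hot_columns(values):
    """Return the sorted distinct values; each becomes one indicator column."""
    return sorted(set(values))


def encode(value, columns):
    """Encode a single nominal value as a list of 0/1 indicators."""
    return [1 if value == column else 0 for column in columns]


# Hypothetical Row IDs; by definition every row has a different one.
train_ids = ["Row0", "Row1", "Row2"]
test_ids = ["Row3", "Row4"]

# Columns are derived from the training data only.
columns = one_hot_columns(train_ids)
train_encoded = [encode(row_id, columns) for row_id in train_ids]
test_encoded = [encode(row_id, columns) for row_id in test_ids]
```

Each training row activates exactly one column of its own, while every test row is encoded as all zeros, so these columns can only help the model memorize training rows and carry no information that generalizes.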