Looking around I was not able to find a good way to use libsvm with Java and I still have some open questions:
1) It is possible to use only libsvm or I have to use also weka? If any, what's the difference?
2) When using String type data how can I pass the training set as Strings? I was using matlab for a similar problem for proteins classification and there I just gave the strings to the machine without problem. Is there a way to do this in Java?
Here is an incomplete example of what I did in matlab (it works):
[~,posTrain] = fastaread('dataset/1.25.1.3_d1ilk__.pos-train.seq');
[~,posTest] = fastaread('dataset/1.25.1.3_d1ilk__.pos-test.seq');
trainKernel = spectrumKernel(trainData,k);
testKernel = spectrumKernel(testData,k);
trainKf =[(1:length(trainData))', trainKernel];
testKf = [(1:length(testData))', testKernel];
disp('custom');
model = libsvmtrain(trainLabel,trainKf,'-t 4');
[~, accuracy, ~] = libsvmpredict(testLabel,testKf,model)
As you can see I read the file in fasta format and feed them to libsvm but libsvm for java look like it wants something called Node that is made of double. What I did is to take byte[] from the String and then transform them into Double. Is it correct?
3) How to use a custom kernel? I've found this line of code
KernelManager.setCustomKernel(custom_kernel);
but with my libsvm.jar I don't find. Which lib do I have to use?
Sorry for the multiple questions, I hope you will give me a brief overview of what is going on here. Thanks.
Please note that I've used LIBSVM for MATLAB, but not for Java. I can only really answer question 1, but hopefully this still helps: