machine-learning nlp weka

SVM file format in Weka


I want to classify texts using SVM (SMO) in Weka. My file contains Persian sentences, with a word in front of each sentence that indicates its class. My question: should I convert these sentences to binary vectors myself and give those vectors to Weka as input, or is it enough to let Weka turn the sentences into vectors by choosing the "StringToWordVector" filter?

sample file:

https://www.dropbox.com/s/ohpyortve8jbwhe/shoor.arff?dl=0


Solution

  • Choosing "StringToWordVector" in Weka itself works, but it is better to convert the sentences to vectors yourself, based on the 1000 most frequent words or any other features you choose. Training is faster that way.
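
A minimal sketch of that preprocessing step, in case it helps. It is not the answerer's own code. It assumes the data is already loaded as (label, sentence) pairs, that splitting on whitespace is an acceptable tokenizer (Persian text may need a proper normalizer and tokenizer), and that class labels contain no spaces or commas. The function names and the `w0`, `w1`, ... attribute names are made up for illustration.

```python
from collections import Counter


def build_vocabulary(sentences, top_n=1000):
    """Return the top_n most frequent whitespace-separated tokens."""
    counts = Counter(tok for s in sentences for tok in s.split())
    return [word for word, _ in counts.most_common(top_n)]


def to_binary_vector(sentence, vocabulary):
    """1 if the vocabulary word occurs in the sentence, else 0."""
    present = set(sentence.split())
    return [1 if word in present else 0 for word in vocabulary]


def to_arff(samples, top_n=1000, relation="texts"):
    """Build ARFF text from a list of (label, sentence) pairs.

    Attributes are named w0, w1, ... (hypothetical names) so that the
    words themselves never need ARFF quoting; the class is the last
    attribute.
    """
    vocabulary = build_vocabulary([s for _, s in samples], top_n)
    labels = sorted({label for label, _ in samples})

    lines = [f"@relation {relation}", ""]
    for i in range(len(vocabulary)):
        lines.append(f"@attribute w{i} {{0,1}}")
    lines.append(f"@attribute class {{{','.join(labels)}}}")
    lines += ["", "@data"]

    for label, sentence in samples:
        vector = to_binary_vector(sentence, vocabulary)
        lines.append(",".join(str(v) for v in vector) + "," + label)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy data standing in for the real sentences.
    demo = [("pos", "good good good film"), ("neg", "bad film")]
    print(to_arff(demo, top_n=2))
```

The resulting text can be saved as an `.arff` file (UTF-8 if the labels are not ASCII) and opened in Weka, where SMO can be trained on it directly without applying a string filter first.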