How to use string data for SVM (SMO) in Weka

I have an ARFF file containing some sentences (in Persian). In the @data section, each sentence has a word next to it that gives its class. I need to use SMO for classification. My questions:

1) Is it necessary to convert the sentences to vectors?

2) I selected "string to word vector", but SMO is still inactive and doesn't work (and neither do other algorithms such as naive Bayes).

How can I use this text data with SMO?

(Screenshot: a very small sample of the file.)

Sample file: https://www.dropbox.com/s/ohpyortve8jbwhe/shoor.arff?dl=0

Solution

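First, you need to apply the "string to word vector" filter. Then, on the Classify tab, you need to change the target class to "(Nom) class". After filtering, the class attribute is typically no longer the last attribute, and Weka uses the last attribute as the class by default, which is why the classifiers appear inactive. This is enough to enable the naive Bayes and SVM algorithms. I downloaded the dataset, and it worked well.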
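To illustrate what converting sentences to word vectors means, here is a minimal sketch in plain Python. The toy sentences and the helper `to_word_vector` are hypothetical and only for illustration. This is not Weka's implementation, and Weka's filter has its own options (for example word presence versus counts) that the sketch does not mirror.

```python
from collections import Counter

# Hypothetical toy sentences; the real data are the Persian sentences in shoor.arff.
sentences = ["good film", "bad film", "good good story"]

# One numeric attribute per distinct word, in sorted order.
vocabulary = sorted({word for s in sentences for word in s.split()})


def to_word_vector(sentence):
    """Return the word counts of `sentence` in vocabulary order."""
    counts = Counter(sentence.split())
    return [counts[word] for word in vocabulary]


vectors = [to_word_vector(s) for s in sentences]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Each sentence becomes a row of numbers with one column per word, which is the kind of input an SVM can work with. The class label stays as a separate nominal attribute.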

You can follow this tutorial: https://www.youtube.com/watch?v=zlVJ2_N_Olo

Hope this helps.

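# Note: arff.load(filename) appears to match the PyPI package "arff";
# liac-arff's load() expects an open file object instead.
import arff
from sklearn import svm
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Read every row of the ARFF file.
data = list(arff.load('shoor.arff'))

# Column 0 holds the sentence, column 1 its class label.
text = []
label = []
for r in data:
    if len(r) > 1:
        text.append(r[0])
        label.append(r[1])

# TF-IDF vectors, then the document-by-document similarity matrix as features.
tfidf = TfidfVectorizer().fit_transform(text)
features = (tfidf @ tfidf.T).toarray()

# Hold out half of the rows for testing and train a linear SVM on the rest.
X_train, X_test, y_train, y_test = train_test_split(features, label, test_size=0.5, random_state=0)
clf = svm.SVC(kernel='linear', C=1).fit(X_train, y_train)
print(clf.score(X_test, y_test))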

1.0
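
The snippet above builds its features from a similarity matrix computed over all sentences before splitting. A possible variant, shown here only as a sketch, feeds the TF-IDF vectors directly to the SVM and fits the vectorizer on the training sentences alone by wrapping both steps in a scikit-learn pipeline. The toy sentences and labels are hypothetical stand-ins for the data read from the ARFF file.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy data standing in for the sentences and labels from the ARFF file.
train_text = ["good film", "great story", "bad film", "awful story"]
train_label = ["pos", "pos", "neg", "neg"]
test_text = ["good story", "bad story"]

# The pipeline fits the vectorizer on the training sentences only,
# then trains a linear SVM on the resulting TF-IDF vectors.
model = make_pipeline(TfidfVectorizer(), SVC(kernel="linear", C=1))
model.fit(train_text, train_label)

# New sentences go through the same fitted vectorizer before classification.
predictions = model.predict(test_text)
print(predictions)
```

With the real data, `train_text` and `train_label` would come from splitting the `text` and `label` lists with `train_test_split` before fitting.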