
Linear SVM and extracting the weights


I am practicing SVMs in R using the iris dataset and want to extract the feature weights (coefficients) from my model. I think I have misinterpreted something, because my output gives me 32 support vectors, whereas I expected four values, one for each of the four variables being analyzed. I know there is a way to do this with the svm() function, but I am trying to use the train() function from caret to fit my SVM.

library(caret)

# Define fitControl
fitControl <- trainControl(## 5-fold CV
              method = "cv",
              number = 5,
              classProbs = TRUE,
              summaryFunction = twoClassSummary )

# Define Tune
grid<-expand.grid(C=c(2^-5,2^-3,2^-1))

########## 
df<-iris
head(df)
df<-df[df$Species!='setosa',]
df$Species<-as.character(df$Species)
df$Species<-as.factor(df$Species)

# set random seed and run the model
set.seed(321)
svmFit1 <- train(x = df[-5],
                 y=df$Species,
                 method = "svmLinear", 
                 trControl = fitControl,
                 preProc = c("center","scale"),
                 metric="ROC",
                 tuneGrid=grid )
svmFit1

I thought it was simply svmFit1$finalModel@coef, but I get 32 values when I believe I should get 4. Why is that?


Solution

  • The coef slot is not the weight vector w. It holds one coefficient per support vector, which is why you see 32 values rather than 4. Here's the relevant entry for the ksvm class in the kernlab docs:

    coef: The corresponding coefficients times the training labels.

    To get the four feature weights you are looking for, multiply these coefficients by the matrix of support vectors:

    coefs <- svmFit1$finalModel@coef[[1]]
    mat <- svmFit1$finalModel@xmatrix[[1]]
    
    coefs %*% mat
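
    Why this works: for a linear kernel the weight vector is w = sum_i (alpha_i * y_i) * x_i, where the x_i are the support vectors and alpha_i * y_i is what the coef slot stores. The following is a minimal sketch that checks this outside caret by calling kernlab directly. The value C = 0.5, the manual scaling, and scaled = FALSE are assumptions made for illustration, not part of the original answer; the sign of the intercept relies on kernlab documenting the b slot as the negative intercept.

    ```r
    library(kernlab)

    # Two-class iris subset, centered and scaled by hand so that ksvm can be
    # told not to rescale (scaled = FALSE) and the algebra stays transparent.
    df <- droplevels(iris[iris$Species != "setosa", ])
    X  <- scale(as.matrix(df[, 1:4]))
    y  <- df$Species

    # C = 0.5 is an arbitrary illustrative value, not a tuned one.
    model <- ksvm(X, y, type = "C-svc", kernel = "vanilladot",
                  C = 0.5, scaled = FALSE)

    coefs <- model@coef[[1]]     # alpha_i * y_i, one entry per support vector
    sv    <- model@xmatrix[[1]]  # support vectors, one row each

    # Weight vector of the linear SVM: w = sum_i (alpha_i * y_i) * x_i
    w <- drop(coefs %*% sv)
    length(coefs)  # number of support vectors
    length(w)      # 4, one weight per feature

    # kernlab documents b as the negative intercept, so the decision value
    # should be X %*% w - b; compare it against predict().
    dec_manual  <- drop(X %*% w) - model@b
    dec_kernlab <- drop(predict(model, X, type = "decision"))
    all.equal(unname(dec_manual), unname(dec_kernlab))
    ```

    If the two decision vectors agree, the four numbers in w fully describe the fitted linear classifier, and the number of support vectors only determines how many terms go into the sum.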
    

    See below for a reproducible example.

    library(caret)
    #> Loading required package: lattice
    #> Loading required package: ggplot2
    #> Warning: package 'ggplot2' was built under R version 3.5.2
    
    # Define fitControl
    fitControl <- trainControl(
      method = "cv",
      number = 5,
      classProbs = TRUE,
      summaryFunction = twoClassSummary
    )
    
    # Define Tune
    grid <- expand.grid(C = c(2^-5, 2^-3, 2^-1))
    
    ########## 
    df <- iris 
    
    df<-df[df$Species != 'setosa', ]
    df$Species <- as.character(df$Species)
    df$Species <- as.factor(df$Species)
    
    # set random seed and run the model
    set.seed(321)
    svmFit1 <- train(x = df[-5],
                     y=df$Species,
                     method = "svmLinear", 
                     trControl = fitControl,
                     preProc = c("center","scale"),
                     metric="ROC",
                     tuneGrid=grid )
    
    coefs <- svmFit1$finalModel@coef[[1]]
    mat <- svmFit1$finalModel@xmatrix[[1]]
    
    coefs %*% mat
    #>      Sepal.Length Sepal.Width Petal.Length Petal.Width
    #> [1,]   -0.1338791  -0.2726322    0.9497457    1.027411
    

    Created on 2019-06-11 by the reprex package (v0.2.1.9000)
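
    Because the model was fit with preProc = c("center", "scale"), these weights refer to the standardized features, so their magnitudes can be compared with each other. As a rough sketch, and treating absolute weight as an importance heuristic rather than a formal measure, the features can be ranked like this (the numbers are copied from the output above; the variable name ranked is only illustrative):

    ```r
    # Weights copied from the output above (centered, scaled feature space).
    w <- c(Sepal.Length = -0.1338791, Sepal.Width = -0.2726322,
           Petal.Length =  0.9497457, Petal.Width =  1.027411)

    # Rank features by absolute weight, largest first.
    ranked <- sort(abs(w), decreasing = TRUE)
    names(ranked)
    #> [1] "Petal.Width"  "Petal.Length" "Sepal.Width"  "Sepal.Length"
    ```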

    Sources

    1. https://www.researchgate.net/post/How_can_I_find_the_w_coefficients_of_SVM

    2. http://r.789695.n4.nabble.com/SVM-coefficients-td903591.html

    3. https://stackoverflow.com/a/1901200/6637133