r for-loop dataframe svm cbind

Analyzing the relationship between cost and gamma for an SVM: "NAs are not allowed in subscripted assignments" error


I have been trying to analyse the relationship between the cost parameter, C, and the gamma parameter of an SVM using the ksvm function from the kernlab package. The function I have written is as follows:

    function (data) {
    library(kernlab)

    p<-ncol(data)
    y<-data[,p]
    x<-data[,-p]

    Rad.gamma<-matrix(seq(exp(-10),exp(1),length=20))
    Con.c<-matrix(c(0.1,0.5,1.5),nrow=1)
    mat<-expand.grid(Rad.gamma,Con.c)
    Output<-data.frame(0,nrow=80,ncol=2)

    for(i in 1:80)
    {
        Gamma<-mat[i,1]
        CC<-mat[i,2]
        Svm<-ksvm(y~.,data=as.data.frame(x),
                  kernel="rbfdot",kpar=list(sigma=Gamma),
                  cross=5, C=CC, type='C-svc',prod.model=FALSE)

        Output[i,1]<-error(Svm)
        Output[i,2]<-cross(svm)
        Output[i,3]<-nSV(svm)/nrow(data)
    }
    Output<-data.frame(Output)

    results<-cbind(mat,Output)
    colnames(results)<-c("C","Train","Cross","SVs")

    results

    }

The error I obtain is:

Error in votematrix[i, ret < 0] <- votematrix[i, ret < 0] + 1 : NAs are not allowed in subscripted assignments

I have searched Stack Overflow for a solution, but the best answer I could find says that data.frame needs to come before cbind when there are missing values. I have been testing this function with the iris data set, which has no missing values. I would like to plot the results and analyze the patterns in the output table; that should be simple enough. The problem is getting the results table to use for the plotting.

Any help would be greatly appreciated.


Solution

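  • The mat made by expand.grid has 60 rows (20 gamma values × 3 cost values), but the loop indexes up to row 80. Rows 61 to 80 do not exist, so mat[i,1] and mat[i,2] return NA there, and ksvm is then called with NA parameters. This should work: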

    data(iris)
    library(kernlab)
    p<-ncol(iris)
    y<-iris[,p]    # response: Species
    x<-iris[,-p]   # predictors
    
    Rad.gamma<-matrix(seq(exp(-10),exp(1),length=20))
    Con.c<-matrix(c(0.1,0.5,1.5),nrow=1)
    mat<-expand.grid(Rad.gamma,Con.c)   # 20 x 3 = 60 rows
    n<-nrow(mat)
    # preallocate one row per parameter combination
    Output<-data.frame(Train=numeric(n),Cross=numeric(n),SVs=numeric(n))
    
    for(i in seq_len(n)){
      Gamma<-mat[i,1]
      CC<-mat[i,2]
      Svm<-ksvm(y~.,data=as.data.frame(x),
                kernel="rbfdot",kpar=list(sigma=Gamma),
                cross=5, C=CC, type='C-svc',prob.model=FALSE)
    
      Output[i,1]<-error(Svm)             # training error
      Output[i,2]<-cross(Svm)             # 5-fold cross-validation error
      Output[i,3]<-nSV(Svm)/nrow(iris)    # fraction of support vectors
    }
    
    results<-cbind(mat,Output)
    # results has 5 columns: the two grid columns plus the three outputs
    colnames(results)<-c("gamma","C","Train","Cross","SVs")
    
    results
    

    Additionally, results has 5 columns, not 4, so colnames needs five names (as in the code above) rather than the four used in the question.
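The row-count mismatch can be checked without fitting any models. The following is a minimal sketch in base R only; the column names gamma and C are chosen here for readability and do not appear in the original code.

```r
# Rebuild the parameter grid from the question (no kernlab needed).
Rad.gamma <- seq(exp(-10), exp(1), length.out = 20)
Con.c <- c(0.1, 0.5, 1.5)
mat <- expand.grid(gamma = Rad.gamma, C = Con.c)

n_rows <- nrow(mat)              # 20 gammas x 3 costs = 60
beyond <- mat[61, "gamma"]       # row 61 does not exist, so this is NA

print(n_rows)
print(is.na(beyond))

# Looping over seq_len(nrow(mat)) can never run past the last row,
# so every (gamma, C) pair read inside the loop is a real value.
for (i in seq_len(nrow(mat))) {
  stopifnot(!is.na(mat[i, "gamma"]), !is.na(mat[i, "C"]))
}
```

Using seq_len(nrow(mat)) instead of a hard-coded 1:80 or 1:60 keeps the loop correct if the grid is later made larger or smaller.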
    

    May I also suggest using apply instead of the for loop. That way there is no need to worry about where to store the results:

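    # apply() passes each row of mat to the function as a numeric vector p
    out = apply(mat, 1, function(p){
      Gamma<-p[1]
      CC<-p[2]
      Svm<-ksvm(y~.,data=as.data.frame(x),
                kernel="rbfdot",kpar=list(sigma=Gamma),
                cross=5, C=CC, type='C-svc',prob.model=FALSE)
    
      # one-row data frame with the three measures for this (gamma, C) pair
      out = data.frame(error(Svm), cross(Svm), nSV(Svm)/nrow(iris))
      colnames(out) = c("train", "Cross","SVs")
      return(out)
    })
    
    out = do.call(rbind, out)     # stack the one-row data frames
    out = data.frame(mat, out)    # put the grid columns next to the results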
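Since the end goal is to plot the results, here is one way the table could be built and plotted in one go. This is only a sketch under a few assumptions: it uses mapply rather than apply, names the grid columns gamma and C, fits on iris with Species ~ ., and fixes a seed because the cross-validation folds are random. The helper name fit_one is made up for illustration.

```r
library(kernlab)  # provides ksvm(), error(), cross(), nSV()

data(iris)
set.seed(1)  # cross-validation folds are random; fix the seed for repeatability

# Same parameter values as in the question, with named grid columns.
grid <- expand.grid(gamma = seq(exp(-10), exp(1), length.out = 20),
                    C = c(0.1, 0.5, 1.5))

# Hypothetical helper: fit one model and return the three measures.
fit_one <- function(gamma, C) {
  fit <- ksvm(Species ~ ., data = iris, type = "C-svc",
              kernel = "rbfdot", kpar = list(sigma = gamma),
              C = C, cross = 5)
  c(Train = error(fit), Cross = cross(fit), SVs = nSV(fit) / nrow(iris))
}

# mapply() returns a 3 x nrow(grid) matrix; transpose for one row per pair.
scores <- t(mapply(fit_one, grid$gamma, grid$C))
results <- cbind(grid, scores)

# One curve of cross-validation error per cost value.
costs <- unique(results$C)
plot(results$gamma, results$Cross, type = "n",
     xlab = "gamma (sigma in rbfdot)", ylab = "5-fold CV error")
for (k in seq_along(costs)) {
  sub <- results[results$C == costs[k], ]
  lines(sub$gamma, sub$Cross, type = "b", col = k)
}
legend("topright", legend = paste("C =", costs),
       col = seq_along(costs), lty = 1)
```

The same loop over costs can be reused with sub$Train or sub$SVs on the y axis to compare training error or the fraction of support vectors.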