Caret: undefined columns selected


I am trying to fit this model, but it gives me the following error:

Error in `[.data.frame`(data, , all.vars(Terms), drop = FALSE) : undefined columns selected

library(caret) ; library(kernlab) ; data(spam)

intrain <- createDataPartition(spam$type, p = 0.75, list = FALSE)
training <- spam[intrain, ]
test <- spam[-intrain, ]

preProc <- preProcess(log10(training[, -58] + 1), method = "pca", pcaComp = 2)
trainPC <- predict(preProc, log10(training[, -58] + 1))

The error is raised by the line below:

modelFit <- train(training$type~.,method="glm",data=trainPC)

Or by this one:

modelFit <- train(training$type~ .,method="glm",preProcess="pca",data=training)

Solution
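
Judging by the call shown in the error message, the data frame passed as `data` is subset by every variable name that appears in the formula. With `training$type ~ .`, those names include `training` and `type`, neither of which is a column of `trainPC`. The base-R sketch below reproduces that lookup on a toy `trainPC` (made-up values, only the column names matter); it illustrates the mechanism and is not caret's actual internals.

```r
# Toy stand-in for the real trainPC: only the column names matter here.
trainPC <- data.frame(PC1 = c(0.1, -0.2, 0.3), PC2 = c(1.0, 0.5, -0.7))

# Expand the "." in the formula against the columns of trainPC.
# terms() does not evaluate training$type, so `training` need not exist.
Terms <- terms(training$type ~ ., data = trainPC)

# Every variable name the formula mentions, on both sides of "~".
needed <- all.vars(Terms)

# Names the formula asks for that trainPC cannot supply.
missing_cols <- setdiff(needed, names(trainPC))

# Subsetting by those names fails the same way the question's call does.
msg <- tryCatch(
  {
    trainPC[, needed, drop = FALSE]
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(needed)
print(missing_cols)
print(msg)
```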

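  • Pass the outcome and the predictors separately through `y` and `x` instead of using a formula. There are warnings (this is not the best method, since the data is categorical), but this works: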

    modelFit <- train(y=training$type,
                      x=trainPC,
                      method="glm")
    
    summary(modelFit)
    
    Call:
    NULL
    
    Deviance Residuals: 
        Min       1Q   Median       3Q      Max  
    -3.8587  -0.3801  -0.0874   0.1967   3.7056  
    
    Coefficients:
                Estimate Std. Error z value Pr(>|z|)    
    (Intercept) -1.07856    0.09537 -11.309   <2e-16 ***
    PC1         -1.79073    0.09707 -18.448   <2e-16 ***
    PC2          0.62770    0.06655   9.432   <2e-16 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    
    (Dispersion parameter for binomial family taken to be 1)
    
        Null deviance: 4628.1  on 3450  degrees of freedom
    Residual deviance: 1755.1  on 3448  degrees of freedom
    AIC: 1761.1
    
    Number of Fisher Scoring iterations: 7
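
If you would rather keep the formula interface, one option is to make the outcome a column of the data frame you pass as `data` and refer to it by its bare name. The sketch below assumes the same `spam` data as the question; `trainDF` is a hypothetical helper name and the seed is arbitrary.

```r
library(caret)
library(kernlab)
data(spam)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
intrain  <- createDataPartition(spam$type, p = 0.75, list = FALSE)
training <- spam[intrain, ]

# Same preprocessing as in the question: log10 transform, then 2 principal components.
preProc <- preProcess(log10(training[, -58] + 1), method = "pca", pcaComp = 2)
trainPC <- predict(preProc, log10(training[, -58] + 1))

# Put the outcome next to the components so the formula can find it in `data`.
# trainDF is a hypothetical name, not part of the original post.
trainDF  <- data.frame(type = training$type, trainPC)
modelFit <- train(type ~ ., method = "glm", data = trainDF)

# The second call from the question: `type` is already a column of `training`,
# so the bare column name is enough.
modelFit2 <- train(type ~ ., method = "glm", preProcess = "pca", data = training)
```

Note that the second call is not equivalent to the manual pipeline: it runs PCA on the untransformed predictors and, by default, keeps as many components as are needed to reach caret's variance threshold rather than exactly two.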