I am trying to fit this model, but it gives me the following error:
Error in `[.data.frame`(data, , all.vars(Terms), drop = FALSE) : undefined columns selected
library(caret); library(kernlab); data(spam)
intrain <- createDataPartition(spam$type, p = 0.75, list = FALSE)
training <- spam[intrain, ]
test <- spam[-intrain, ]
preProc <- preProcess(log10(training[, -58] + 1), method = "pca", pcaComp = 2)
trainPC <- predict(preProc, log10(training[, -58] + 1))
The error occurs on the line below:
modelFit <- train(training$type~.,method="glm",data=trainPC)
or on this one:
modelFit <- train(training$type~ .,method="glm",preProcess="pca",data=training)
This works, although it produces warnings (glm is not the best method here, as the data is categorical):
modelFit <- train(y=training$type,
x=trainPC,
method="glm")
summary(modelFit)
Call:
NULL
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.8587  -0.3801  -0.0874   0.1967   3.7056
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.07856    0.09537 -11.309   <2e-16 ***
PC1         -1.79073    0.09707 -18.448   <2e-16 ***
PC2          0.62770    0.06655   9.432   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 4628.1 on 3450 degrees of freedom
Residual deviance: 1755.1 on 3448 degrees of freedom
AIC: 1761.1
Number of Fisher Scoring iterations: 7
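As for why the formula versions fail: the most likely cause is that `training$type ~ .` is resolved against `data`, so the variable list contains `training` and `type`, and `trainPC` has no such columns. The sketch below reproduces this in base R only. `pcs` and `outcome` are hypothetical stand-ins for `trainPC` and `training` with made-up values, and it assumes caret selects columns the way the error message indicates.

```r
# Hypothetical stand-ins for trainPC and training (invented values, not the spam data)
pcs <- data.frame(PC1 = c(0.1, -0.4, 1.2, -0.9, 0.5, -1.1),
                  PC2 = c(0.3, 0.8, -0.5, 0.2, -0.7, 0.4))
outcome <- data.frame(type = factor(c("spam", "nonspam", "spam",
                                      "nonspam", "spam", "nonspam")))

# The dot is expanded against the columns of pcs, but the left-hand side
# contributes the names "outcome" and "type", which are not columns of pcs.
bad_vars <- all.vars(terms(outcome$type ~ ., data = pcs))
missing_cols <- setdiff(bad_vars, names(pcs))

# Selecting those names from pcs fails; capture the message instead of stopping.
err <- tryCatch(pcs[, bad_vars, drop = FALSE],
                error = function(e) conditionMessage(e))

# Putting the outcome into the same data frame lets the formula use a bare
# column name, so every variable in the formula is found in the data.
pcs$type <- outcome$type
good_vars <- all.vars(terms(type ~ ., data = pcs))
all_found <- all(good_vars %in% names(pcs))
```

If that reading is right, the same idea should carry over to the original code: add the outcome with `trainPC$type <- training$type` and call `train(type ~ ., method = "glm", data = trainPC)`, or use `train(type ~ ., method = "glm", preProcess = "pca", data = training)` for the second call.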