I have the following dataset:
library(MASS)
install.packages("gclus")
library(gclus)  # attach gclus so that data(wine) can find the data set
data(wine)
View(wine)
install.packages("car")
I want to split it 70:30 into a training set and a test set. I also want to carry out LDA on the following data subsets:
wine[c("Class", "Malic", "Hue", "Magnesium")]
wine[c("Class","Hue", "Alcalinity", "Phenols", "Malic", "Magnesium", "Intensity", "Nonflavanoid","Flavanoids")]
Lastly, I am using the function predict to predict the class memberships for the test data and to compare the predictions with the true class memberships.
I am getting some errors while doing it, so any help would be appreciated.
First, split the data into training and test sets in a roughly 70:30 ratio like this:
library(MASS)
library(gclus)
data(wine)  # load the wine data set shipped with gclus
set.seed(123)
ind <- sample(2, nrow(wine), replace = TRUE, prob = c(0.7, 0.3))
training <- wine[ind==1,]
testing <- wine[ind==2,]
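Note that sample(2, ..., prob = c(0.7, 0.3)) assigns every row independently, so the split is 70:30 only on average. If an exact split is needed, one possible sketch is to sample row indices without replacement. The helper name split_exact and the toy data frame below are illustrative assumptions, not part of the code above:

```r
# Hypothetical helper: split a data frame into train/test with an exact proportion.
split_exact <- function(df, p = 0.7) {
  n <- nrow(df)
  train_idx <- sample(n, size = round(p * n))  # row numbers, drawn without replacement
  list(training = df[train_idx, , drop = FALSE],
       testing  = df[-train_idx, , drop = FALSE])
}

# Toy data frame standing in for wine (178 rows, the size of the wine data)
set.seed(123)
toy <- data.frame(Class = rep(1:3, length.out = 178), x = rnorm(178))
parts <- split_exact(toy, p = 0.7)
nrow(parts$training)  # 125
nrow(parts$testing)   # 53
```

With the real data this would be parts <- split_exact(wine), followed by training <- parts$training and testing <- parts$testing.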
Next, you can use the function lda from MASS to perform a linear discriminant analysis like this:
model1 <- lda(Class ~ Malic + Hue + Magnesium, training)
model2 <- lda(Class ~ Hue + Alcalinity + Phenols + Malic + Magnesium + Intensity + Nonflavanoid + Flavanoids, training)
Finally, you can predict on the test set and check the results with a confusion matrix like this:
p1 <- predict(model1, testing)$class
tab <- table(Predicted = p1, Actual = testing$Class)
tab
Output:
         Actual
Predicted  1  2  3
        1 13  3  0
        2  5 14  0
        3  0  2 11
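The same table can also be read per class. As a small sketch, the counts printed above are typed in by hand here (the object name cm is just for illustration) and used to compute recall and precision for each class:

```r
# Confusion matrix copied from the output above (rows = predicted, columns = actual).
# matrix() fills column by column, so each line below is one "Actual" column.
cm <- matrix(c(13,  5,  0,
                3, 14,  2,
                0,  0, 11),
             nrow = 3,
             dimnames = list(Predicted = c("1", "2", "3"),
                             Actual    = c("1", "2", "3")))

recall    <- diag(cm) / colSums(cm)  # share of each actual class that was found
precision <- diag(cm) / rowSums(cm)  # share of each predicted class that was correct

round(recall, 3)     # 0.722 0.737 1.000
round(precision, 3)  # 0.812 0.737 0.846
```

In this split, class 3 is always recognised, while classes 1 and 2 are sometimes confused with each other.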
The accuracy is:
cat("Accuracy is:", sum(diag(tab))/sum(tab))
Accuracy is: 0.7916667
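The second model can be checked in the same way. To avoid repeating the steps, one option is a small wrapper such as the sketch below. The name lda_eval is made up for this example, and it assumes the response is the first variable in the formula:

```r
library(MASS)

# Hypothetical helper: fit LDA on `train`, predict on `test`, and return
# the confusion matrix together with the overall accuracy.
lda_eval <- function(formula, train, test) {
  model <- lda(formula, data = train)
  pred  <- predict(model, test)$class
  # Use the same levels as the predictions so the table is always square
  actual <- factor(test[[all.vars(formula)[1]]], levels = levels(pred))
  tab <- table(Predicted = pred, Actual = actual)
  list(table = tab, accuracy = sum(diag(tab)) / sum(tab))
}

# With the objects from above, the two models could be compared like this:
# res1 <- lda_eval(Class ~ Malic + Hue + Magnesium, training, testing)
# res2 <- lda_eval(Class ~ Hue + Alcalinity + Phenols + Malic + Magnesium +
#                    Intensity + Nonflavanoid + Flavanoids, training, testing)
# c(model1 = res1$accuracy, model2 = res2$accuracy)
```

Because both models are scored on the same test rows, the two accuracies are directly comparable.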