I am trying to run the predict function for an LDA model. I have two predictors, x1 and x2, and a categorical response y that takes the values -1 and 1. Each variable contains N = 500 data points, and I am splitting the dataset as follows:
xx = data.frame(cbind(x1,x2))
x = cbind(x1,x2)
x_train = x[1:350,]
x_test = x[351:N,]
y_train = y[1:350]
y_test = y[351:N]
Some output:
x1 x2 y
1 -1.1843924 1.920765 -1
2 3.3167508 2.321631 1
3 -3.0301378 5.973256 -1
4 -1.3262624 -2.320463 -1
5 -0.6534166 -3.050822 -1
6 -2.0051728 -4.118190 -1
Then I fit the LDA model and try the predict function:
modelo.lda = lda(y_train~xx[1:350,1]+xx[1:350,2])
predict.lda = predict(modelo.lda, newdata=xx[351:N,])
Note: the xx values are written that way following this answer to the same problem.
But this is where I get:
Warning message: 'newdata' had 150 rows but variables found have 350 rows
I thought that keeping the same xx[init:end,] form would fix the problem, as the answer to that question stated, but it seems it doesn't.
What could it be?
Thanks in advance.
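As a suggestion, if you have train and test sets, it is better to put the response and the predictors in one data frame, refer to them by column name in the formula, and pass the data frame through the data argument. predict() looks up the formula's variables by name in newdata, and an expression such as xx[1:350,1] is not a column name, so the training values are picked up again instead. Try this: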
library(MASS)
#Data
N <- 500
x1 <- rnorm(N,0,1)
x2 <- rnorm(N,1,5)
y <- round(runif(N,0,1),0)
xx = data.frame(x1,x2,y)
x_train = xx[1:350,]
x_test = xx[351:N,]
#Models
modelo.lda = lda(y~x1+x2,data = x_train)
predict.lda = predict(modelo.lda, newdata=x_test)
No warnings will be produced.
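To see the difference between the two ways of writing the formula, here is a minimal sketch on simulated data. The seed, the object names (train, test, bad_fit, good_fit) and the way y is generated are assumptions made for illustration, not taken from the question's actual data.

```r
library(MASS)

set.seed(1)  # assumed seed, only so the sketch is reproducible
N <- 500
xx <- data.frame(x1 = rnorm(N, 0, 1),
                 x2 = rnorm(N, 1, 5),
                 y  = factor(sample(c(-1, 1), N, replace = TRUE)))
train <- xx[1:350, ]
test  <- xx[351:N, ]

# Formula written with indexing expressions: the model terms are the
# expressions "xx[1:350, 1]" and "xx[1:350, 2]", which are not columns
# of newdata, so predict() re-evaluates them on the training rows.
bad_fit  <- lda(train$y ~ xx[1:350, 1] + xx[1:350, 2])
bad_pred <- suppressWarnings(predict(bad_fit, newdata = test))
length(bad_pred$class)   # one prediction per training row, not per test row

# Formula written with column names plus data=: predict() finds x1 and x2
# in newdata and returns one prediction per test row.
good_fit  <- lda(y ~ x1 + x2, data = train)
good_pred <- predict(good_fit, newdata = test)
length(good_pred$class)

# Confusion table on the held-out rows
table(predicted = good_pred$class, actual = test$y)
```

Comparing the length of the predicted classes with nrow(test) is a quick way to confirm that the predictions really correspond to the test set.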