I am very new to ML in R and am simply trying to use all variables from X_train to predict y_train when training a model. I am running into a problem because they are not in the same data.frame. My code is as follows:
logitmod <- glm(log_y_train ~ log_X_train, family = "binomial")
log_y_train is a factor of length 200386 and log_X_train is a data.frame of 174 variables and 200386 rows, which is why I cannot simply type out all the column names.
However, I get the following error:
invalid type (list) for variable 'log_X_train'
I thought this was a data.frame, but I nonetheless tried unlist(), which then told me that the variable lengths differed. Can anyone help me fix this so that I can use both objects in the logit model?
Thanks
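Some background on the two errors: a data.frame is stored as a list of columns, and a single term in a formula must be a vector or matrix, hence "invalid type (list)". Calling unlist() flattens every column into one long vector of nrow × ncol elements, which no longer matches the length of the response. A minimal sketch on a toy data.frame (the name X and its contents are made up for illustration):

```r
# Hypothetical toy data.frame: 4 rows, 2 columns
X <- data.frame(a = 1:4, b = 5:8)

is.list(X)          # a data.frame is a list of columns
nrow(X)             # number of observations
length(unlist(X))   # all columns concatenated: nrow(X) * ncol(X) elements
```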
Bind log_y_train and log_X_train into a single data.frame so that you can use " ~ ." in the formula to represent all variables in log_X_train:
glm(log_y_train ~ ., family = binomial(), data = cbind(log_y_train, log_X_train))
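To try the " ~ ." approach without the full data set, here is a small sketch on simulated data. The column names a, b, c, the sample size, and the way the response is generated are assumptions made purely for illustration; only the glm() call mirrors the one above.

```r
set.seed(1)

# Hypothetical stand-ins for the real objects
log_X_train <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100))
log_y_train <- factor(rbinom(100, 1, plogis(log_X_train$a)))

# Bind response and predictors; the response column is named after the object
train <- cbind(log_y_train, log_X_train)

# "." expands to every column of `train` except the response
fit_dot <- glm(log_y_train ~ ., family = binomial(), data = train)
names(coef(fit_dot))
```

The fitted model should have one coefficient per column of log_X_train plus an intercept.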
Alternatively, use reformulate() to create a formula with all variables in log_X_train as predictors and log_y_train as the response. This approach does not need log_y_train and log_X_train to be bound together:
glm(reformulate(names(log_X_train), "log_y_train"), family = binomial(), data = log_X_train)
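As a quick check, the sketch below builds the formula with reformulate() on simulated data and compares the fit with the " ~ ." version. The toy columns a, b, c and the simulated response are assumptions for illustration only. Note that log_y_train is not a column of log_X_train here; glm() finds it in the environment where the formula was created.

```r
set.seed(1)

# Hypothetical stand-ins for the real objects
log_X_train <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100))
log_y_train <- factor(rbinom(100, 1, plogis(log_X_train$a)))

# Build log_y_train ~ a + b + c from the column names
f <- reformulate(names(log_X_train), response = "log_y_train")
print(f)

# Fit with the explicit formula, then with " ~ ." on the bound data
fit_ref <- glm(f, family = binomial(), data = log_X_train)
fit_dot <- glm(log_y_train ~ ., family = binomial(),
               data = cbind(log_y_train, log_X_train))

# Both approaches should give the same coefficients
all.equal(coef(fit_ref), coef(fit_dot))
```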