I'm trying to fit a logistic regression on a subset of the data. This is my code:
reg1 <- glm(smoke_binary ~ Age + Marital.Status + Highest.Qualification,
            data = subset(uf_train, (uf_train$Marital.Status == "Married" &
                                     uf_train$Marital.Status == "Separated" &
                                     uf_train$Marital.Status == "Widowed" &
                                     uf_train$Marital.Status == "Divorced" &
                                     uf_train$Highest.Qualification == "GCSE/CSE" &
                                     uf_train$Highest.Qualification == "O Level" &
                                     uf_train$Highest.Qualification == "A Levels")),
            family = binomial)
But I keep getting this error. I don't know what it means or how I can fix it:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels
Your subset selection is off. By using & ("AND") for mutually exclusive levels, you end up with an empty data set (like saying "select all the M&Ms that are green AND brown"). With no rows left, the factors have no levels to contrast, which is what the error is complaining about. When debugging, it helps to do the subset selection separately, so that you can check the results:
glm_data <- subset(uf_train,
                   Marital.Status %in% c("Widowed", "Married", "Separated", "Divorced") &
                   Highest.Qualification %in% c("GCSE/CSE", "O Level", "A Levels"))
nrow(glm_data)
table(glm_data$Marital.Status)
table(glm_data$Highest.Qualification)
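
To see the difference end to end, here is a minimal sketch on made-up data. The `toy` data frame is a hypothetical stand-in for `uf_train` (its values are random, and the extra levels "Single" and "Degree" are assumptions for illustration), so only the shape of the result matters, not the coefficients.

```r
# Hypothetical stand-in for uf_train; values are random, for illustration only
set.seed(1)
n <- 200
toy <- data.frame(
  smoke_binary = rbinom(n, 1, 0.3),
  Age = sample(18:80, n, replace = TRUE),
  Marital.Status = factor(sample(
    c("Single", "Married", "Separated", "Widowed", "Divorced"),
    n, replace = TRUE)),
  Highest.Qualification = factor(sample(
    c("GCSE/CSE", "O Level", "A Levels", "Degree"),
    n, replace = TRUE))
)

# AND across mutually exclusive levels: no row can be both, so nothing is kept
empty <- subset(toy, Marital.Status == "Married" & Marital.Status == "Separated")
nrow(empty)  # 0

# %in% keeps the rows that match any of the listed levels
glm_data <- subset(toy,
                   Marital.Status %in% c("Widowed", "Married", "Separated", "Divorced") &
                   Highest.Qualification %in% c("GCSE/CSE", "O Level", "A Levels"))

# Optional: drop the levels that no longer occur, so table() shows no zero counts
glm_data <- droplevels(glm_data)
table(glm_data$Marital.Status)

# Fit the model on the checked subset
reg1 <- glm(smoke_binary ~ Age + Marital.Status + Highest.Qualification,
            data = glm_data, family = binomial)
summary(reg1)
```

Once nrow() and the tables look right on your real data, you can pass glm_data to glm() the same way.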