Variable lengths differ error message in R

I am running a logistic regression on the spam dataset from https://hastie.su.domains/ElemStatLearn/. The dependent variable is in the last column, which is given as V58 after I import the data in R. But I get the error message

variable lengths differ (found in 'V1')

I've checked for NAs using na.omit, and I tried removing the V1 column to see if that fixes the issue, but I get the same message. I tried using the old cv.glm function instead, but that does not work either. I also tried referring to the dependent variable as spam.data$V58. I am stuck. What am I missing? Below is the code I have so far, just to fit the model.

spam.data <- data_frame(read.table(datapathname))

dim(spam.data)
str(spam.data)
summary(spam.data)

set.seed(2718)
row.number = sample(1:nrow(spam.data), 0.7*nrow(spam.data))
train = spam.data[row.number,]
test = spam.data[-row.number,]
dim(train)
dim(test)

model.logistic = glm(as.factor(spam.data[58])~., data=train, family=binomial) #The error gets thrown here.

summary(model.logistic)

I should also say that the way I created the data set was to copy and paste the data from the website to a text file and read it into R from that file. The info on the site says we should get 4601 rows and 58 columns, and indeed we get this size dataset.

Solution

There are two mistakes when you fit the logistic model:

The data argument is train, but you take the outcome from spam.data. These two data frames have different numbers of observations (all rows versus the 70% sample), which is what triggers the "variable lengths differ" error. Instead, use train[58] as the outcome.
When you convert the outcome to a factor with as.factor, a single level NA is returned, because train[58] is a one-column data frame rather than a vector. as.factor handles a dbl vector fine, but not a data frame. Therefore, first unlist the outcome with unlist(train[58]).
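Both points can be checked on a small example. This is only a sketch: toy is a made-up base data.frame standing in for spam.data, and its two columns are hypothetical, but single-bracket indexing behaves the same way on a tibble.

```r
# Hypothetical stand-in for spam.data: one predictor and a 0/1 outcome.
toy <- data.frame(V1 = c(0.1, 0.4, 0.2, 0.9, 0.5),
                  V2 = c(0, 1, 0, 1, 1))
train <- toy[c(1, 3, 4), ]

# Point 1: the full data and the training subset have different row counts,
# so an outcome taken from `toy` cannot line up with predictors from `train`.
nrow(toy)
nrow(train)

# Point 2: single brackets keep a one-column data frame, not a vector.
is.data.frame(train[2])   # TRUE
is.numeric(train[[2]])    # TRUE: double brackets extract the column itself

# Either of these yields a plain vector that as.factor can handle.
as.factor(unlist(train[2]))
as.factor(train[[2]])
```

train[[58]] or train$V58 would be equivalent ways to get the vector without unlist.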

With these two changes, the model ran for me:

model.logistic = glm(as.factor(unlist(train[58]))~., data=train, family=binomial)
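One thing to watch with the call above: the name V58 never appears on the left-hand side, so as far as I can tell the . on the right-hand side still expands to every column in train, including V58 itself, and the outcome ends up among its own predictors. A simpler variant is to name the outcome column directly in the formula. The sketch below uses a synthetic data set (toy, with hypothetical columns V1 to V3, where V3 plays the role of V58) because the real file is not available here. It relies on the fact that glm with family = binomial accepts a numeric 0/1 response, so no factor conversion is needed.

```r
# Synthetic stand-in for the spam data (assumption: numeric predictors
# and a 0/1 outcome in the last column, here V3 instead of V58).
set.seed(2718)
n <- 200
toy <- data.frame(V1 = rnorm(n), V2 = rnorm(n))
toy$V3 <- rbinom(n, size = 1, prob = plogis(0.8 * toy$V1 - 0.5 * toy$V2))

# 70/30 split, as in the question.
row.number <- sample(seq_len(nrow(toy)), size = floor(0.7 * nrow(toy)))
train <- toy[row.number, ]
test  <- toy[-row.number, ]

# Name the outcome directly: it is looked up in `train`, so its length
# always matches the predictors, and `.` leaves it off the right-hand side.
model.logistic <- glm(V3 ~ ., data = train, family = binomial)

# Predictors actually used by the model (the outcome is not among them).
attr(terms(model.logistic), "term.labels")
```

For the spam data the corresponding call would be glm(V58 ~ ., data = train, family = binomial).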