r csv glmnet

How to load a csv file into R as a factor for use with glmnet and logistic regression


I have a csv file called "y" (a single column of numeric values) that consists of zeros and ones, where rows with the value 1 mark the target class for logistic regression, and another file called "x" with the same number of rows and with columns of numeric predictor values. How do I load these so that I can then use cv.glmnet? Loading them with

x <- read.csv('x',header=FALSE,sep=",")
y <- read.csv('y',header=FALSE )

throws the error

Error in y %*% rep(1, nc) : 
requires numeric/complex matrix/vector arguments

when I call

cvfit = cv.glmnet(x, y, family = "binomial")

I know that "y" should be loaded as a "factor," but how do I do this? My online searches have found all sorts of approaches that have just confused me. What is the simple one-liner to just load this data ready for glmnet?


Solution

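  • cv.glmnet expects x as a numeric matrix and y as a vector (or a two-level factor when family = "binomial"), but read.csv returns data frames, even for a single-column file. Convert both before fitting:

    xmat = as.matrix(x)

    yvec = y[[1]]   # extract the single column; as.vector(y) does not turn a data frame into an atomic vector

    Then use

    cvfit = cv.glmnet(xmat, yvec, family = "binomial")

    If you specifically want a factor, factor(y[[1]]) also works as the response.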

    If you can share your data as dput() output, I can give it a try.
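
For reference, here is a minimal end-to-end sketch of the load-and-convert step. Since the original files are not available, it writes simulated data to temporary files first; the names x_sim, y_sim, x_path and y_path, the sample size and the number of predictors are all made up for illustration. The glmnet call is guarded so the conversion part runs even if the package is not installed.

```r
# Simulate the two files described in the question (hypothetical data).
set.seed(1)
n <- 100
x_path <- tempfile(fileext = ".csv")
y_path <- tempfile(fileext = ".csv")
x_sim <- matrix(rnorm(n * 5), nrow = n)
y_sim <- rbinom(n, size = 1, prob = plogis(x_sim[, 1]))
write.table(x_sim, x_path, sep = ",", row.names = FALSE, col.names = FALSE)
write.table(y_sim, y_path, sep = ",", row.names = FALSE, col.names = FALSE)

# read.csv() returns a data frame, even for a single-column file.
x <- read.csv(x_path, header = FALSE)
y <- read.csv(y_path, header = FALSE)

# Predictors as a numeric matrix; response as a two-level factor.
xmat <- as.matrix(x)
yfac <- factor(y[[1]], levels = c(0, 1))

# Fit only if glmnet is available (assumed to be installed separately).
if (requireNamespace("glmnet", quietly = TRUE)) {
  cvfit <- glmnet::cv.glmnet(xmat, yfac, family = "binomial")
  print(cvfit$lambda.min)
}
```

With real files, the "one-liner" for the response would be something like y <- factor(read.csv('y', header = FALSE)[[1]]), with x passed through as.matrix() in the same way.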