
The length of labels must equal to the number of rows in the input data


I don't know why I'm getting this error. My training data (training) is a sparse matrix.

dim(training)
> 14407 161

dim(label.train)
> 14407 1

xgb.train <- xgb.DMatrix(data = training, label = label.train)
> Error in setinfo.xgb.DMatrix(dmat, names(p), p[[1]]) : 
The length of labels must equal to the number of rows in the input data

I have checked my data and:

  • label.train is a data.frame
  • training does not have all zero rows or columns
  • all values in training are numeric

P.S. My data is huge, so I can't post a reproducible example. I just need tips on what might be wrong from anyone who has run into this error.


Solution

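  • You're getting the error because your labels are a data.frame. length() of a data.frame returns its number of columns (1 here), not its number of rows, which is presumably why the length check in the error message fails. Passing the labels as a vector or a matrix works for me.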

    vec_y <- mtcars$vs
    mat_y <- as.matrix(mtcars$vs)
    df_y  <- mtcars[,8,drop=FALSE] #column vs is the 8th column
    
    x <- as.matrix(mtcars[,-8])    #column vs is the 8th column
    
    #vector labels: works
    xgboost::xgb.DMatrix(data=x, label=vec_y)
    #matrix labels: works
    xgboost::xgb.DMatrix(data=x, label=mat_y)
    #df labels: doesn't work
    xgboost::xgb.DMatrix(data=x, label=df_y)
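
Applied to the question's setup, the fix might look like the sketch below. The objects here are hypothetical stand-ins built from mtcars, since the original data was not posted, and the sketch assumes label.train is a one-column data.frame, as dim(label.train) suggests. The xgb.DMatrix call is guarded so the base-R part runs even without xgboost installed.

```r
# Hypothetical stand-ins for the asker's objects (the real data was not posted).
label.train <- data.frame(y = mtcars$vs)   # one-column data.frame of labels
training    <- as.matrix(mtcars[, -8])     # feature matrix with 32 rows

# length() of a data.frame counts columns, not rows.
length(label.train)   # 1
nrow(label.train)     # 32

# Extract the single column as a plain vector before passing it as the label.
label.vec <- label.train[[1]]
stopifnot(is.atomic(label.vec), length(label.vec) == nrow(training))

# Build the DMatrix only if xgboost is available.
if (requireNamespace("xgboost", quietly = TRUE)) {
  dtrain <- xgboost::xgb.DMatrix(data = training, label = label.vec)
}
```

Checking class(label.train) and length(label.train) before the call is a quick way to spot this kind of mismatch on data that is too large to inspect by eye.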