Predicting with a model created using multinom hangs and errors


I'm trying to use the nnet library to create a multinomial logistic regression model from my training data to see if I can use it to predict my test data.

I set everything up in R using this script:

library(nnet)

folder <- "C:/***/"
trainingfile <- "training-set.txt"
testfile <- "test-set.txt"

train <- read.table(paste(folder, trainingfile, sep=''), sep=",", header=FALSE)
train.classes <- t(train[1:1])
train.data <- train[2:16]

test <- read.table(paste(folder, testfile, sep=''), sep=",", header=FALSE)
test.classes <- t(test[1:1])
test.data <- test[2:16]

train.model <- multinom(V1 ~ ., train, maxit=450) #converges after roughly 430 iterations

This all works well and the function multinom reports convergence.

To classify the test data with the model, I use:

predictions <- predict(train.model, test.data)

However, I'm then greeted with the error: Error in eval(expr, envir, enclos) : object 'V17' not found. Yet when I inspect train.model, I see that it does contain a V17 term:

> train.model
Call:
multinom(formula = V1 ~ ., data = train, maxit = 450)

Coefficients:
  (Intercept)        V2 
B -12.9514837 1.0668464 
C -48.1154774 1.6160071 
D  -2.2901219 1.0062945 
E -39.4371326 0.6848848 
F -20.6759707 0.8613838 
G -21.4471217 1.2858480 
H -17.4302527 0.8102932 
I  -4.7391825 1.3124087 
J -12.3513130 1.1404751 
K -13.9557738 0.7574471 
L  -0.4915034 0.7191369 
M -14.0855382 0.8888810 
N  -0.4372225 0.6041747 
O -18.2596753 1.2708861 
P  -9.8504326 1.2672870 
Q -20.9940977 1.8104502 
R  -5.8030089 0.8677690 
S -12.9944084 0.8097735 
T -32.5636344 1.8977861 
U  -9.1752184 1.6059663 
V -13.5695897 1.4547335 
W  -6.2590220 1.1292715 
X  -4.5939135 0.7603754 
Y -15.6763068 1.6498374 
Z -37.1840564 0.7382329 

*SNIP*

         V17
  1.63319426
  1.93093207
  0.80392847
  1.79189803
  1.32248565
  1.72440154
  1.22022835
  1.03014847
  0.20977345
  2.40335443
  1.17253978
  0.65072776
  0.46675729
  1.16579165
  1.50787334
  1.41267773
  1.71666099
  0.72543894
  0.64857852
  0.32401569
  1.33290027
  0.83846524
  1.02863203
 -0.05005955
  0.13792242

Residual Deviance: 26196.1 
AIC: 27046.1 

This is very strange, and I have no clue why the error occurs. To get more information I tried calling summary(train.model), but that hangs R indefinitely. I've tried both the 32-bit and 64-bit versions of R 2.15.2 (the latest stable version), and the result is the same. Does anybody know how I can resolve the error and the hang, and how to predict correctly with a model created by multinom?


Solution

  • Summarizing from comments above:

    Ensure that the following is true:

    all(names(train)[-1] %in% names(test.data)) # [-1] to ignore V1
    

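    Otherwise, predict will throw an error. In the question, test.data <- test[2:16] keeps only V2 through V16, while the model was fit on every column of train, including V17. Passing a data frame that includes V17, such as test itself, supplies the missing column.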

    To add a little value: in my experience, summary.multinom takes so long because it calls vcov.multinom, which computes the Hessian. If you make multiple calls to summary(train.model), it makes sense to compute the Hessian once in the call to multinom (which may still take a while):

    train.model <- multinom(V1 ~ ., train, maxit=450, Hess = TRUE)
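
The column check and the fix can be sketched end to end as follows. This is a minimal illustration, not the asker's actual setup: the make_data helper, the three-class labels, and the three predictors V2 to V4 are hypothetical stand-ins for the real files, and trace = FALSE is only there to silence the iteration log.

```r
library(nnet)

set.seed(1)

# Hypothetical stand-in for the real data: a class column V1 plus
# three numeric predictors V2..V4 (the question's data goes up to V17).
make_data <- function(n) {
  data.frame(
    V1 = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
    V2 = rnorm(n),
    V3 = rnorm(n),
    V4 = rnorm(n)
  )
}
train <- make_data(200)
test  <- make_data(50)

# Hess = TRUE stores the Hessian at fit time so later summary() calls reuse it.
train.model <- multinom(V1 ~ ., train, maxit = 450, Hess = TRUE, trace = FALSE)

# Mirror the question's mistake: the test subset drops the last predictor.
test.data <- test[2:3]
missing.cols <- setdiff(names(train)[-1], names(test.data))
print(missing.cols)  # predictors the model needs but test.data lacks

# Select every predictor the model was trained on, then predict.
test.data <- test[names(train)[-1]]
stopifnot(all(names(train)[-1] %in% names(test.data)))
predictions <- predict(train.model, test.data)

# Cross-tabulate predicted classes against the known test classes.
print(table(predicted = predictions, actual = test$V1))
```

Selecting the test columns by name, rather than by position as in test[2:16], keeps the test set aligned with the training predictors even if the column count changes.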