r, mlogit

Why will this csv file not load in R mlogit?


SOLVED

This has been driving me up the wall all afternoon. I simply cannot figure out why I cannot run mlogit on this simple data set.

A small snippet:

race,horseno,place,win,
1,1,4,0,1,0.7,1,0.33,0.13,0.09,0.72,1
1,2,2,0,0.45,0.78,0.99,0.5,0.22,0.2,0.73,0.98
1,3,1,1,0.42,1,0.99,1,0.18,0.1,0.73,0.76
1,4,3,0,0.19,0.27,0.99,0.17,0.22,0.12,0.73,0.47

The full file can be found here. This exact CSV doesn't work when I run the call below, and this is the error:

> x <- mlogit.data(data, choice = "win", shape = "long", id.var = "race", alt.var = "horseno")
Error in `$<-.data.frame`(`*tmp*`, "id1", value = c(1L, 1L, 1L, 1L, 1L,  : 
  replacement has 22 rows, data has 26

Honestly, if anyone can save me, I'd appreciate it.


Solution

  • I don't fully understand the choice situations in your data. Nevertheless, I compared your data with the TravelMode data in this tutorial. race in your data seems to play the same role as individual in the TravelMode data: it is the variable that identifies the choice situation. So I assume race should be passed as chid.var rather than id.var, which is the only change from your call. Here is my attempt:

    library(mlogit)  # provides mlogit.data()
    
    dat = read.table(text = "race,horseno,place,win,of,ppf,orf,df,jf,tf,wf,af
    1,1,4,0,1,0.7,1,0.33,0.13,0.09,0.72,1
    1,2,2,0,0.45,0.78,0.99,0.5,0.22,0.2,0.73,0.98
    1,3,1,1,0.42,1,0.99,1,0.18,0.1,0.73,0.76
    1,4,3,0,0.19,0.27,0.99,0.17,0.22,0.12,0.73,0.47
    2,1,2,0,1,1,1,1,0.31,0.16,0.61,0.81
    2,2,4,0,0.24,0.88,1,1,0.09,0.07,0.61,0.92
    2,3,1,1,0.16,0.03,1,1,0.57,0.29,0.61,0.98
    2,4,5,0,0.21,0.47,1,1,0.25,0.05,0.61,0.92
    2,5,8,0,0.01,0.3,1,1,0.19,0,0.64,0.92
    2,6,7,0,0.01,0.21,1,1,0.2,0,0.61,1
    2,7,3,0,0.1,0.34,1,1,0.16,0.04,0.58,0.79
    2,8,11,0,0.06,0.03,1,1,0.21,0.16,0.61,0.92
    2,9,10,0,0.03,0.03,1,1,0.19,0.16,0.61,0.92
    2,10,9,0,0.01,0.29,1,1,0.09,0.05,0.61,0.77
    2,11,6,0,0.01,0.25,1,1,0.09,0.05,0.61,0.77", header = TRUE, sep = ",")
    
    x <- mlogit.data(dat, choice = "win", shape = "long", 
                     chid.var = "race", alt.var = "horseno")
    x
    # ~~~~~~~
    #   first 10 observations out of 15 
    # ~~~~~~~
    #   race horseno place   win   of  ppf  orf   df   jf   tf   wf   af idx
    # 1     1       1     4 FALSE 1.00 0.70 1.00 0.33 0.13 0.09 0.72 1.00 1:1
    # 2     1       2     2 FALSE 0.45 0.78 0.99 0.50 0.22 0.20 0.73 0.98 1:2
    # 3     1       3     1  TRUE 0.42 1.00 0.99 1.00 0.18 0.10 0.73 0.76 1:3
    # 4     1       4     3 FALSE 0.19 0.27 0.99 0.17 0.22 0.12 0.73 0.47 1:4
    # 5     2       1     2 FALSE 1.00 1.00 1.00 1.00 0.31 0.16 0.61 0.81 2:1
    # 6     2       2     4 FALSE 0.24 0.88 1.00 1.00 0.09 0.07 0.61 0.92 2:2
    # 7     2       3     1  TRUE 0.16 0.03 1.00 1.00 0.57 0.29 0.61 0.98 2:3
    # 8     2       4     5 FALSE 0.21 0.47 1.00 1.00 0.25 0.05 0.61 0.92 2:4
    # 9     2       5     8 FALSE 0.01 0.30 1.00 1.00 0.19 0.00 0.64 0.92 2:5
    # 10    2       6     7 FALSE 0.01 0.21 1.00 1.00 0.20 0.00 0.61 1.00 2:6
    # 
    # ~~~ indexes ~~~~
    #   chid alt
    # 1     1   1
    # 2     1   2
    # 3     1   3
    # 4     1   4
    # 5     2   1
    # 6     2   2
    # 7     2   3
    # 8     2   4
    # 9     2   5
    # 10    2   6
    # indexes:  1, 2
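
A quick way to check that the data has the shape the call above assumes (race as the choice situation, horseno as the alternative, win marking the chosen alternative) is to count runners and winners per race in base R. This is only a sketch and not part of mlogit: the names runners, winners and dup_pairs are made up for illustration, and only the first four columns of the sample rows are used.

```r
# First four columns of the same 15 sample rows used above
dat <- read.csv(text = "race,horseno,place,win
1,1,4,0
1,2,2,0
1,3,1,1
1,4,3,0
2,1,2,0
2,2,4,0
2,3,1,1
2,4,5,0
2,5,8,0
2,6,7,0
2,7,3,0
2,8,11,0
2,9,10,0
2,10,9,0
2,11,6,0", header = TRUE)

# Number of alternatives (horses) in each choice situation (race)
runners <- table(dat$race)

# Number of rows flagged as the winner in each race
winners <- tapply(dat$win, dat$race, sum)

# Number of repeated (race, horseno) pairs
dup_pairs <- sum(duplicated(dat[, c("race", "horseno")]))

print(runners)
print(winners)
print(dup_pairs)
```

In this sample the races have different numbers of runners, each race has one row with win equal to 1, and no (race, horseno) pair is repeated. Running the same three checks on the full CSV may help show whether some race breaks one of these assumptions.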