Tags: r, dataframe, debugging, mlogit

Creating a data frame for mlogit fails with "Error in names(data)[ix] : invalid subscript type 'language'"


I am trying to use this data set to create an MNL (multinomial logit) model:

https://data.cityofnewyork.us/Transportation/Citywide-Mobility-Survey-Person-Survey-2019/6bqn-qdwq

Every time I try to convert my original data frame like this,

nydata_df = dfidx(nydata, shape="wide",choice="work_mode",varying = sort)

I get this error:

Error in names(data)[ix] : invalid subscript type 'language'

I'm unclear about what is causing this error. I think something is wrong with dplyr, but I am not sure.


Solution

  • According to the mlogit package vignette, the varying argument specifies which variables should be "lengthened" when dfidx converts a data frame from wide to long. Are you actively trying to lengthen your data frame (in the style of tidyr::pivot_longer())?

    If you aren't, I don't believe you need the varying argument at all (see ?stats::reshape for more on varying). If you do want to use it, you should pass specific variables (column names or positions) rather than just sort. Note that sort is a base R function, not a column selection, which is most likely why dfidx fails with the "invalid subscript type" error. Additionally, when I run your models, I don't get a NaN for McFadden's R2, the p-value, or the chi-square test. Are your packages fully updated?

    library(dfidx)
    library(mlogit)
    library(performance) # to extract McFadden's R2 easily
    
    packageVersion("dfidx")
    #> [1] '0.0.5'
    packageVersion("mlogit")
    #> [1] '1.1.1'
    packageVersion("dplyr")
    #> [1] '1.0.10'
    # currently running RStudio Version 2022.7.2.576
    
    nydata <- read.csv(url("https://data.cityofnewyork.us/api/views/6bqn-qdwq/rows.csv?accessType=DOWNLOAD"))
    nydata_df <- dfidx(data = nydata, 
                       shape = "wide",
                       choice = "work_mode")
    
    m <- mlogit(work_mode ~ 1, nydata_df)
    #summary(m)
    r2_mcfadden(m)
    #> McFadden's R2 
    #>  1.110223e-16
    m3 <- mlogit(work_mode ~ 1 | harassment_mode + age, nydata_df)
    #summary(m3)
    r2_mcfadden(m3)
    #> McFadden's R2 
    #>    0.03410362
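
For comparison, here is a minimal sketch of what varying looks like when you really do have alternative-specific columns to lengthen. It does not use your survey data. It uses the Fishing data set that ships with mlogit, where columns 2 to 9 hold price and catch for each of the four alternatives (price.beach, catch.beach, and so on). The column positions 2:9 are specific to Fishing, so treat them as an illustration rather than something to copy for your own data.

```r
library(dfidx)
library(mlogit)

# Fishing is a wide data set bundled with mlogit: one row per individual,
# with price.* and catch.* columns for each alternative.
data("Fishing", package = "mlogit")

# varying takes column positions (or names), not a function such as sort.
# Columns 2:9 are the alternative-specific price.* and catch.* variables.
Fish <- dfidx(Fishing,
              varying = 2:9,
              shape = "wide",
              choice = "mode")

# price and catch vary by alternative, so they go in the first part
# of the formula.
m_fish <- mlogit(mode ~ price + catch, data = Fish)
```

After the conversion, Fish has one row per individual and alternative, which is why varying is only needed when such alternative-specific columns exist in the wide data.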