I am trying to get my data to work with the mlogit package in R. I failed to convert the wide data format to long format with the mlogit.data command, so I tried it myself using melt.
This is what I have so far (IssueID is a case identifier, dv will be the dependent variable, table is the data in wide format, newdata is the same data in long format):
library(data.table)  # provides setDT() and melt()
IssueID<-c(1,2,3)
dv<-c(1,2,3)
table<-as.data.frame(cbind(IssueID, dv))
newdata<-melt(setDT(table), id.vars = c("IssueID"), measure.vars = c("dv"))
Wide format:
   IssueID dv
1:       1  1
2:       2  2
3:       3  3
Long format:
   IssueID variable value
1:       1       dv     1
2:       2       dv     2
3:       3       dv     3
However, to run the data with mlogit, I need a dataset that contains every value of the dependent variable for each case, plus a dummy that records which of these alternatives was chosen by the unit of observation.
The usable data should look like this:
#case2<-c(1,1,1,2,2,2,3,3,3)
#variable2<-(c("dv","dv","dv","dv","dv","dv","dv","dv","dv"))
#value2<-c(1,2,3,1,2,3,1,2,3)
#choice2<-c(1,0,0,0,1,0,0,0,1)
#newdata2<-as.data.frame(cbind(case2, variable2,value2,choice2))
  case2 variable2 value2 choice2
1     1        dv      1       1
2     1        dv      2       0
3     1        dv      3       0
4     2        dv      1       0
5     2        dv      2       1
6     2        dv      3       0
7     3        dv      1       0
8     3        dv      2       0
9     3        dv      3       1
Do you have any suggestions for code that does this, so that I don't have to code the choice variable manually? Thank you for your assistance.
You can probably get there from the long format of the data using complete and fill.
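library(dplyr)
library(tidyr)
df %>%
  # every existing row is an alternative that was actually chosen
  mutate(choice = 1) %>%
  # add the missing IssueID/value combinations and mark them as not chosen
  complete(IssueID, value = seq(min(value), max(value)),
           fill = list(choice = 0)) %>%
  # the added rows have NA in variable; carry the label down
  fill(variable)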
#  IssueID value variable choice
#    <int> <int> <fct>     <dbl>
#1       1     1 dv            1
#2       1     2 dv            0
#3       1     3 dv            0
#4       2     1 dv            0
#5       2     2 dv            1
#6       2     3 dv            0
#7       3     1 dv            0
#8       3     2 dv            0
#9       3     3 dv            1
data
df <- structure(list(IssueID = 1:3,
                     variable = structure(c(1L, 1L, 1L), .Label = "dv", class = "factor"),
                     value = 1:3),
                class = "data.frame", row.names = c(NA, -3L))
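If you would rather not depend on tidyr, the same expansion can be sketched in base R, starting directly from the wide data. This is only a minimal illustration: the names wide, alternatives, chosen and long are made up for the example, and it assumes the set of alternatives is exactly the set of values observed in dv (the complete() call above uses the full range from min to max instead).

```r
# Wide input: one row per case, dv holds the chosen alternative
# (hypothetical name `wide`, mirroring the question's data)
wide <- data.frame(IssueID = 1:3, dv = c(1, 2, 3))

# Assumption: the alternatives are the values that occur in dv
alternatives <- sort(unique(wide$dv))

# Cross every case with every alternative; expand.grid varies the
# first argument fastest, so rows come out grouped by IssueID
long <- expand.grid(value = alternatives, IssueID = wide$IssueID)

# Look up the alternative each case actually chose and flag that row
chosen <- wide$dv[match(long$IssueID, wide$IssueID)]
long$choice <- as.integer(long$value == chosen)

# Reorder the columns to match the layout asked for in the question
long <- long[, c("IssueID", "value", "choice")]
long
```

Each case ends up with one row per alternative, and choice is 1 only on the row whose value equals that case's dv.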