
Column of 1's and 0's to indicate individuals' choices across a range of alternatives?


I am trying to set up my data to work with the mlogit package in R.

I have a dataframe created with the following code:

id <- 1:10
id <- rep(id, each=5)
site <- c("site1", "site2", "site3", "site4", "site5")
choice <- c("site3", "site5", "site1", "site4", "site2",
            "site4", "site3", "site5", "site2", "site1")
df <- cbind(id, site)

I want to create a binary variable that indicates site choice for every value of id. As the id variable is a repeated sequence, the new indicator variable needs to be 0 for every row except the one in which "site" is equivalent to the relevant value of "choice". For id == 1, this will be the first element of the "choice" vector. For id == 2, it will be the 2nd element of the choice vector and so on.

A final dataframe with the variable included should look like this:

      id   site   indicator
 [1,] "1"  "site1" "0"
 [2,] "1"  "site2" "0"
 [3,] "1"  "site3" "1"
 [4,] "1"  "site4" "0"
 [5,] "1"  "site5" "0"
 [6,] "2"  "site1" "0"
 [7,] "2"  "site2" "0"
 [8,] "2"  "site3" "0"
 [9,] "2"  "site4" "0"
[10,] "2"  "site5" "1"
[11,] "3"  "site1" "1"
[12,] "3"  "site2" "0"
[13,] "3"  "site3" "0"
[14,] "3"  "site4" "0"
[15,] "3"  "site5" "0"
[16,] "4"  "site1" "0"
[17,] "4"  "site2" "0"
[18,] "4"  "site3" "0"
[19,] "4"  "site4" "1"
[20,] "4"  "site5" "0"
[21,] "5"  "site1" "0"
[22,] "5"  "site2" "1"
[23,] "5"  "site3" "0"
[24,] "5"  "site4" "0"
[25,] "5"  "site5" "0"
[26,] "6"  "site1" "0"
[27,] "6"  "site2" "0"
[28,] "6"  "site3" "0"
[29,] "6"  "site4" "1"
[30,] "6"  "site5" "0"
[31,] "7"  "site1" "0"
[32,] "7"  "site2" "0"
[33,] "7"  "site3" "1"
[34,] "7"  "site4" "0"
[35,] "7"  "site5" "0"
[36,] "8"  "site1" "0"
[37,] "8"  "site2" "0"
[38,] "8"  "site3" "0"
[39,] "8"  "site4" "0"
[40,] "8"  "site5" "1"
[41,] "9"  "site1" "0"
[42,] "9"  "site2" "1"
[43,] "9"  "site3" "0"
[44,] "9"  "site4" "0"
[45,] "9"  "site5" "0"
[46,] "10" "site1" "1"
[47,] "10" "site2" "0"
[48,] "10" "site3" "0"
[49,] "10" "site4" "0"
[50,] "10" "site5" "0"

I have attempted this many times and cannot figure it out, nor can I find a relevant answer online.

Thanks in advance :)


Solution

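  • As Akrun suggested, define df with data.frame rather than cbind. cbind(id, site) returns a character matrix, so id is no longer numeric and df$id does not work: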

    df <- data.frame(id, site)
    

    Then do:

    df$indicator <- (df$site == choice[df$id])*1
    

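    Here choice[df$id] looks up each row's chosen site by id, and the comparison with df$site gives TRUE/FALSE. Multiplying by 1 coerces that logical result to numeric 1s and 0s.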

    Result:

       id  site indicator
    1   1 site1         0
    2   1 site2         0
    3   1 site3         1
    4   1 site4         0
    5   1 site5         0
    6   2 site1         0
    7   2 site2         0
    8   2 site3         0
    9   2 site4         0
    10  2 site5         1
    11  3 site1         1
    12  3 site2         0
    13  3 site3         0
    14  3 site4         0
    15  3 site5         0
    16  4 site1         0
    17  4 site2         0
    18  4 site3         0
    19  4 site4         1
    20  4 site5         0
    21  5 site1         0
    22  5 site2         1
    23  5 site3         0
    24  5 site4         0
    25  5 site5         0
    26  6 site1         0
    27  6 site2         0
    28  6 site3         0
    29  6 site4         1
    30  6 site5         0
    31  7 site1         0
    32  7 site2         0
    33  7 site3         1
    34  7 site4         0
    35  7 site5         0
    36  8 site1         0
    37  8 site2         0
    38  8 site3         0
    39  8 site4         0
    40  8 site5         1
    41  9 site1         0
    42  9 site2         1
    43  9 site3         0
    44  9 site4         0
    45  9 site5         0
    46 10 site1         1
    47 10 site2         0
    48 10 site3         0
    49 10 site4         0
    50 10 site5         0
    

    If you want strings instead of numbers or factors, use as.character on the column you want to convert.
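
The steps above can be put together in one self-contained sketch. It assumes the same id, site and choice values as in the question; the names chosen, picks and indicator_chr are illustrative only, and as.integer() is used here as an alternative to *1 rather than as the answer's own method.

```r
# Rebuild the question's data; stringsAsFactors = FALSE keeps site as character
id <- rep(1:10, each = 5)
site <- rep(c("site1", "site2", "site3", "site4", "site5"), times = 10)
choice <- c("site3", "site5", "site1", "site4", "site2",
            "site4", "site3", "site5", "site2", "site1")
df <- data.frame(id, site, stringsAsFactors = FALSE)

# choice[df$id] repeats each individual's choice once per row (length 50)
chosen <- choice[df$id]

# as.integer() turns TRUE/FALSE into 1/0, like multiplying by 1
df$indicator <- as.integer(df$site == chosen)

# Sanity check: every individual has exactly one chosen site
picks <- tapply(df$indicator, df$id, sum)
stopifnot(all(picks == 1))

# Character version of the indicator, if strings are needed
df$indicator_chr <- as.character(df$indicator)
```

Note that indexing choice by df$id relies on id running from 1 to the length of choice. If the ids were arbitrary labels, a named vector or a merge on id would be a safer lookup.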