Trying to maintain index when using gather


I am trying to convert my data from wide to long, but for some reason, the ID column does not show up after the conversion. This is what my data looks like:

> head(dca)

# A tibble: 6 x 11
  ResponseId  Q9        Q10       Q11       Q12       Q13       Q14       Q15      Q16      Q17      Q18     
  <chr>       <chr>     <chr>     <chr>     <chr>     <chr>     <chr>     <chr>    <chr>    <chr>    <chr>   
1 "Response … "Regardi… "Regardi… "Regardi… "Regardi… "Regardi… "Regardi… "Regard… "Regard… "Regard… "Regard…
2 "{\"Import… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Impo… "{\"Imp… "{\"Imp… "{\"Imp… "{\"Imp…
3 "R_2V7lrA7… "Using F… "Using T… "Using Y… "Using T… "Using T… "Using I… "Using … "Using … "Using … "Using …
4 "R_3nozPOT… "Using F… "Using T… "Using T… "Using T… "Using T… "Using I… "Using … "Using … "Using … "Using …
5 "R_2TB0Wwy… "Using Y… "Using T… "Using T… "Using T… "Using T… "Using F… "Using … "Using … "Using … "Using …
6 "R_2woFtS9… "Using Y… "Using T… "Using Y… "Using T… "Using I… "Using I… "Using … "Using … "Using … "Using …

After applying the following transformation:

library(tidyr)
keycol <- "ResponseId"
valuecol <- "Response"
gathercols <- c("Q9","Q10","Q11","Q12","Q13","Q14","Q15","Q16","Q17","Q18" )
dca_long<- gather_(dca,dca$ResponseId, keycol, valuecol, gathercols)

This is what I get:

> head(dca_long)
# A tibble: 6 x 2
  ResponseId 'Response`                                                               
  <chr>      <chr>                                                                                
1 Q9         "Regarding the use of social media, which of the following options would you prefer?"
2 Q9         "{\"ImportId\":\"QID12\"}"                                                           
3 Q9         "Using Facebook on PC for utility"                                                   
4 Q9         "Using Facebook on PC for utility"                                                   
5 Q9         "Using Youtube on mobile for entertainment"                                          
6 Q9         "Using Youtube on mobile for entertainment"      

Essentially, I want dca_long to have a column holding the matching ResponseId value from dca for each row. I am doing this so that I can then make dca suitable for mlogit().

Someone in the comments asked for this output to better understand the data:

> dput(head(dca))
structure(list(ResponseId = c("Response ID", "{\"ImportId\":\"_recordId\"}", 
"R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa", "R_2TB0WwyWCugTyEg", 
"R_2woFtS93jHyiv8F"), Q9 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID12\"}", "Using Facebook on PC for utility", 
"Using Facebook on PC for utility", "Using Youtube on mobile for entertainment", 
"Using Youtube on mobile for entertainment"), Q10 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID13\"}", "Using TikTok on PC for utility", 
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
"Using Twitter on mobile for entertainment"), Q11 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID14\"}", "Using Youtube on PC for utility", 
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
"Using Youtube on PC for utility"), Q12 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID15\"}", "Using Twitter on mobile for utility", 
"Using Twitter on mobile for utility", "Using Twitter on mobile for utility", 
"Using Twitter on mobile for utility"), Q13 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID16\"}", "Using TikTok on Mobile for utility", 
"Using TikTok on Mobile for utility", "Using TikTok on Mobile for utility", 
"Using Instagram on PC for entertainment"), Q14 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID17\"}", "Using Instagram on PC for utility", 
"Using Instagram on PC for utility", "Using Facebook on Mobile for entertainment", 
"Using Instagram on PC for utility"), Q15 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID18\"}", "Using Twitter on mobile for entertainment", 
"Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
"Using Facebook on mobile for utility"), Q16 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID19\"}", "Using Facebook on PC for entertainment", 
"Using Facebook on PC for entertainment", "Using Tiktok on mobile for utility", 
"Using Facebook on PC for entertainment"), Q17 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID20\"}", "Using Youtube on PC for entertainment", 
"Using Youtube on PC for entertainment", "Using Instagram on mobile for utility", 
"Using Instagram on mobile for utility"), Q18 = c("Regarding the use of social media, which of the following options would you prefer?", 
"{\"ImportId\":\"QID21\"}", "Using Instagram on mobile for entertainment", 
"Using Youtube on PC for utility", "Using Instagram on mobile for entertainment", 
"Using Instagram on mobile for entertainment")), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))


Solution

  • This example might help you solve your problem.

    library(tidyr)

    # simulated wide data for reproducibility
    wide_data <- read.table(header=TRUE, text='
     subject sex time_1  time_2  time_3   
           1   M     15   16    23 
           2   F     25   20    48 
           3   F     30   25    55 
           4   M     35   32    60 
    ')
    

    Using gather, you should get something like the following:

    # gather the three time_* columns; subject and sex are left untouched
    gather(data = wide_data, 
           key = alternative,
           value = time, 
           c(time_1, time_2, time_3), 
           factor_key = TRUE) 
    
    
       subject sex alternative time
    1        1   M      time_1   15
    2        2   F      time_1   25
    3        3   F      time_1   30
    4        4   M      time_1   35
    5        1   M      time_2   16
    6        2   F      time_2   20
    7        3   F      time_2   25
    8        4   M      time_2   32
    9        1   M      time_3   23
    10       2   F      time_3   48
    11       3   F      time_3   55
    12       4   M      time_3   60
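
    For reference, tidyr 1.0.0 and later describe gather() as superseded by pivot_longer(). A minimal sketch of the same reshape with that function, assuming such a version is installed (long_data is just an illustrative name):

    ```r
    library(tidyr)

    # same simulated wide data as above
    wide_data <- read.table(header = TRUE, text = '
     subject sex time_1 time_2 time_3
           1   M     15     16     23
           2   F     25     20     48
           3   F     30     25     55
           4   M     35     32     60
    ')

    # columns not listed in `cols` (subject, sex) are kept as identifiers
    long_data <- pivot_longer(
      wide_data,
      cols = c(time_1, time_2, time_3),
      names_to = "alternative",   # new column holding the old column names
      values_to = "time"          # new column holding the cell values
    )
    ```

    The rows come out grouped by subject rather than by alternative, so the order differs from the gather output, but the content is the same.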
    

    If this doesn't help, please post a snippet of your data (dca) so we can work through it. Best!

    [EDITED]

    Using the data you posted:

    df <- structure(list(ResponseId = c("Response ID", "{\"ImportId\":\"_recordId\"}", 
    "R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa", "R_2TB0WwyWCugTyEg", 
    "R_2woFtS93jHyiv8F"), Q9 = c("Regarding the use of social media, which of the following options would you prefer?", 
    "{\"ImportId\":\"QID12\"}", "Using Facebook on PC for utility", 
    "Using Facebook on PC for utility", "Using Youtube on mobile for entertainment", 
    "Using Youtube on mobile for entertainment"), Q10 = c("Regarding the use of social media, which of the following options would you prefer?", 
    "{\"ImportId\":\"QID13\"}", "Using TikTok on PC for utility", 
    "Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
    "Using Twitter on mobile for entertainment"), Q11 = c("Regarding the use of social media, which of the following options would you prefer?", 
    "{\"ImportId\":\"QID14\"}", "Using Youtube on PC for utility", 
    "Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
    "Using Youtube on PC for utility"), Q12 = c("Regarding the use of social media, which of the following options would you prefer?", 
    "{\"ImportId\":\"QID15\"}", "Using Twitter on mobile for utility", 
    "Using Twitter on mobile for utility", "Using Twitter on mobile for utility", 
    "Using Twitter on mobile for utility"), Q13 = c("Regarding the use of social media, which of the following options would you prefer?", 
    "{\"ImportId\":\"QID16\"}", "Using TikTok on Mobile for utility", 
    "Using TikTok on Mobile for utility", "Using TikTok on Mobile for utility", 
    "Using Instagram on PC for entertainment"), Q14 = c("Regarding the use of social media, which of the following options would you prefer?", 
    "{\"ImportId\":\"QID17\"}", "Using Instagram on PC for utility", 
    "Using Instagram on PC for utility", "Using Facebook on Mobile for entertainment", 
    "Using Instagram on PC for utility"), Q15 = c("Regarding the use of social media, which of the following options would you prefer?", 
    "{\"ImportId\":\"QID18\"}", "Using Twitter on mobile for entertainment", 
    "Using Twitter on mobile for entertainment", "Using Twitter on mobile for entertainment", 
    "Using Facebook on mobile for utility"), Q16 = c("Regarding the use of social media, which of the following options would you prefer?", 
    "{\"ImportId\":\"QID19\"}", "Using Facebook on PC for entertainment", 
    "Using Facebook on PC for entertainment", "Using Tiktok on mobile for utility", 
    "Using Facebook on PC for entertainment"), Q17 = c("Regarding the use of social media, which of the following options would you prefer?", 
    "{\"ImportId\":\"QID20\"}", "Using Youtube on PC for entertainment", 
    "Using Youtube on PC for entertainment", "Using Instagram on mobile for utility", 
    "Using Instagram on mobile for utility"), Q18 = c("Regarding the use of social media, which of the following options would you prefer?", 
    "{\"ImportId\":\"QID21\"}", "Using Instagram on mobile for entertainment", 
    "Using Youtube on PC for utility", "Using Instagram on mobile for entertainment", 
    "Using Instagram on mobile for entertainment")), row.names = c(NA, 
    -6L), class = c("tbl_df", "tbl", "data.frame"))
    

    and using parts of the snippet I posted earlier:

    df_long <- gather(data = df, 
                      key = alternative,
                      value = value_answer, 
                      Q9:Q18, 
                      factor_key = TRUE) 
    

    You should get something like the following, which keeps the ResponseId column:

      ResponseId             alternative value_answer                                          
      <chr>                  <fct>       <chr>                                                 
    1 "Response ID"          Q9          "Regarding the use of social media, which of the foll~
    2 "{\"ImportId\":\"_rec~ Q9          "{\"ImportId\":\"QID12\"}"                            
    3 "R_2V7lrA7n29xU0i6"    Q9          "Using Facebook on PC for utility"                    
    4 "R_3nozPOTbJBE1OBa"    Q9          "Using Facebook on PC for utility"                    
    5 "R_2TB0WwyWCugTyEg"    Q9          "Using Youtube on mobile for entertainment"           
    6 "R_2woFtS93jHyiv8F"    Q9          "Using Youtube on mobile for entertainment"     
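
    A note on the original call: key is the name of the new column that receives the old column names (Q9, Q10, and so on). Setting keycol <- "ResponseId" asks gather to create a column with the same name as the existing ID column, which is a likely reason the IDs went missing. The sketch below uses a distinct name instead. It also drops the first two rows, on the assumption that they are export header rows (question text and ImportId metadata) rather than actual responses. dca_small is a cut-down stand-in for the posted data, and Question and Response are illustrative column names:

    ```r
    library(tidyr)

    # cut-down stand-in for the posted data: two header rows, two responses
    question_text <- "Regarding the use of social media, which of the following options would you prefer?"
    dca_small <- data.frame(
      ResponseId = c("Response ID", "{\"ImportId\":\"_recordId\"}",
                     "R_2V7lrA7n29xU0i6", "R_3nozPOTbJBE1OBa"),
      Q9  = c(question_text, "{\"ImportId\":\"QID12\"}",
              "Using Facebook on PC for utility",
              "Using Facebook on PC for utility"),
      Q10 = c(question_text, "{\"ImportId\":\"QID13\"}",
              "Using TikTok on PC for utility",
              "Using Twitter on mobile for entertainment"),
      stringsAsFactors = FALSE
    )

    # assumption: rows 1 and 2 are export metadata, not responses
    responses <- dca_small[-(1:2), ]

    # the key name must differ from every column that is kept
    dca_long <- gather(responses, key = "Question", value = "Response", Q9, Q10)
    ```

    Each respondent now appears once per question, with ResponseId repeated on every row.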
    

    I hope this helps. Best!