I have survey data I'm working on in R that includes questions where respondents were given a list of options and asked to rank their top five. The data looks like this:
head(data)
responseid ChoiceA ChoiceB ChoiceC ChoiceD ChoiceE ChoiceF
1 001 5 2 1 NA 4 3
2 002 NA 4 3 5 2 1
3 003 3 1 NA 2 4 5
4 004 NA 5 2 1 3 4
I'd like to reshape it so that it looks like this:
head(data_new)
responseid first second third fourth fifth
1 001 C B F E A
2 002 F E C B D
3 003 B D A E F
4 004 D C E F B
What would be an efficient way to do this, preferably in tidyverse? I feel like the answer is going to involve changing the values in a new column based on the names of other columns, which I can't figure out how to do.
I'm also pulling this data out of SurveyGizmo, so if anyone has suggestions for a way to set it up so that the data exports to a csv the way I want it in the first place, I would appreciate that as well. Thanks!
The gist is that you want to pivot longer, clean it up a bit and then pivot wider using the ranks as the column names instead of the choices. This is a similar method to akrun, though I think given the limit of five choices hardcoding the ordinal column names is less complex.
library(tidyverse)
tbl <- read_table2(
"responseid ChoiceA ChoiceB ChoiceC ChoiceD ChoiceE ChoiceF
001 5 2 1 NA 4 3
002 NA 4 3 5 2 1
003 3 1 NA 2 4 5
004 NA 5 2 1 3 4"
)
tbl %>%
pivot_longer(
cols = -responseid,
names_to = "choice",
values_to = "rank",
values_drop_na = TRUE
) %>%
mutate(
choice = str_remove(choice, "Choice"),
rank = c("first", "second", "third", "fourth", "fifth")[as.integer(rank)]
) %>%
pivot_wider(names_from = rank, values_from = choice) %>%
select(responseid, first, second, third, fourth, fifth)
#> # A tibble: 4 x 6
#> responseid first second third fourth fifth
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 001 C B F E A
#> 2 002 F E C B D
#> 3 003 B D A E F
#> 4 004 D C E F B
Created on 2020-03-04 by the reprex package (v0.3.0)