I want to restructure my data set from long to wide. The difficulty is that my data are already in a partly wide format, which I would like to make even wider, and I could not find any comparable posts covering this kind of restructuring.
This is what my data set looks like at the moment:
or, shown with the str() function:
Classes ‘data.table’ and 'data.frame': 1651 obs. of 13 variables:
$ passcode : chr "AN04AD" "AN04AD" "AN04AD" "AN04AD" ...
$ question_id : num 1 2 3 4 5 6 7 8 9 10 ...
$ question_type: chr "TrueOrFalse" "TrueOrFalse" "TrueOrFalse" "TrueOrFalse" ...
$ option_1 : num 1 1 1 1 1 0 NA 0 1 0 ...
$ option_2 : num 0 0 0 0 1 0 NA 1 0 1 ...
$ option_3 : num 0 0 0 0 1 0 NA 1 0 1 ...
$ option_4 : num 0 0 0 0 2 1 NA 0 1 0 ...
$ option_5 : num 0 0 0 0 2 0 NA 0 0 0 ...
$ option_6 : num 0 0 0 0 1 0 NA 0 0 0 ...
$ option_7 : num NA NA NA NA 2 NA NA NA NA NA ...
$ option_8 : num NA NA NA NA 1 NA NA NA NA NA ...
$ created_at : POSIXct, format: "2021-06-03 18:28:16" "2021-06-03 18:28:16" "2021-06-03 18:28:16" "2021-06-03 18:28:16" ...
$ updated_at : POSIXct, format: NA NA NA NA ..
After restructuring, it should look like this:
This means I need just one row per person (passcode) in the data set. Overall, I have 11 items (question_id) and 1529 rows, which makes 139 different passcodes. The items (question_id) vary in their number of answer options, but the maximum is 8 answer options. Item 1 (question_id = 1), for example, has only 6 answer options, which is why (after restructuring) the new variables "question1_option7" and "question1_option8" contain only NAs. During restructuring, I would like the "option_x" variables to be renamed like this: question1_option1, question1_option2, and so on.
This can be done with pivot_wider() from tidyr (not dplyr):
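library(tidyr)  # pivot_wider()
library(dplyr)  # tibble() and %>%

# Toy data: 10 passcodes, 2 questions each, 2 answer options per question
df <- tibble(passcode = rep(LETTERS[1:10], each = 2),
             question_id = rep(1:2, times = 10),
             questionType = "TrueOrFalse",
             option_1 = round(runif(min = 0, max = 3, 20)),
             option_2 = round(runif(20)))

# One row per passcode, one column per option/question combination
df %>% pivot_wider(names_from = 'question_id',
                   values_from = c('option_1', 'option_2'),
                   id_cols = 'passcode')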
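With the default settings, the new columns are named like option_1_1 and option_1_2 rather than question1_option1. Below is a minimal sketch of one way to get the naming asked for in the question, assuming tidyr 1.2.0 or later (for names_vary) and dplyr 1.0.0 or later (for rename_with). The toy data and the passcode "BX11QQ" are made up for illustration; only the column names follow the question.

```r
library(tidyr)
library(dplyr)

# Hypothetical toy data: 2 passcodes, 2 questions, up to 3 options.
# Question 1 has only 2 options, so option_3 is NA for it.
long <- tibble(
  passcode    = rep(c("AN04AD", "BX11QQ"), each = 2),
  question_id = rep(1:2, times = 2),
  option_1    = c(1, 0, 1, 1),
  option_2    = c(0, 1, 0, 0),
  option_3    = c(NA, 2, NA, 1)
)

wide <- long %>%
  # Drop the underscore so that .value becomes "option1", "option2", ...
  rename_with(~ sub("option_", "option", .x), starts_with("option_")) %>%
  pivot_wider(
    id_cols     = passcode,
    names_from  = question_id,
    values_from = starts_with("option"),
    # Build names such as question1_option1
    names_glue  = "question{question_id}_{.value}",
    # Keep all options of one question next to each other
    names_vary  = "slowest"
  )

names(wide)
```

Options that a question does not have (here option 3 of question 1) simply stay NA in the wide result, which matches the behaviour described for question1_option7 and question1_option8.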