I'm trying to write a function that uses pivot_longer, and would like to use my function object as the object of the names_to argument in pivot_longer.
record <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
x214532 <- c("shirts, shoes",
"shoes, purses, hats",
"shirts, shoes, hats, heavy machinery",
"sponges, shoes",
"hats, heavy machinery",
"",
"heavy machinery, purses, shirts",
"heavy machinery, shoes, sponges",
"sponges",
"shoes")
screening_data_responses_char <- data.frame(record, x214532)
record x214532
1 1 shirts, shoes
2 2 shoes, purses, hats
3 3 shirts, shoes, hats, heavy machinery
4 4 sponges, shoes
5 5 hats, heavy machinery
6 6
7 7 heavy machinery, purses, shirts
8 8 heavy machinery, shoes, sponges
9 9 sponges
10 10 shoes
Ultimately I'm trying to un-concatenate column x214532 and create a long dataset that separates the data into the items listed in the columns and then creates a long dataset, like this:
record x214532
1 1 shirts
2 1 shoes
3 2 shoes
4 2 purses
5 2 hats
6 3 shirts
7 3 shoes
8 3 hats
9 3 heavy machinery
10 4 sponges
11 4 shoes
12 5 hats
13 5 heavy machinery
14 6
15 7 heavy machinery
16 7 purses
17 7 shirts
18 8 heavy machinery
19 8 shoes
20 8 sponges
21 9 sponges
22 10 shoes
I'd like the column containing the data to still be called x214532, but I'm having trouble passing it through names_to of pivot_longer. Here's what I've got:
remove_col_prefix <- function(x) {
pattern <- "^[^_]+_"
stringr::str_remove(x, pattern)
}
deconcatenate <- function(questionID) {
screening_data_responses_questionID <- cSplit_e(data=screening_data_responses_char,split.col=questionID,sep=",",type="character")
screening_data_responses_questionID <- screening_data_responses_questionID %>%
select(-questionID) %>%
pivot_longer(cols=c(starts_with(questionID)),
names_to="questionID",
values_to="questionID_resp") %>%
drop_na(questionID_resp) %>%
select(-questionID_resp) %>%
mutate(questionID=remove_col_prefix(questionID)) %>%
select(c(deid_pat_id, questionID))
screening_data_responses_char <- screening_data_responses_char %>%
select(-questionID)
screening_data_responses_char <-merge(screening_data_responses_char,screening_data_responses_questionID,by="deid_pat_id",all=TRUE)
}
screening_data_responses_char <- deconcatenate(questionID="x214532")
Things I've tried:
-{{}} and !! operators (Using function arguments as column names)
-enquo
-this cement function: (https://adv-r.hadley.nz/quasiquotation.html)
-deparse(subsitute(x))
Things I've gotten:
-The column disappears entirely from the output
-The column is called questionID instead of x214532
-The column is called questionID and all of the text turns into x214532
I'm sure there are a few things that I'm doing wrong, or possibly something that I've gotten right in pivot_longer but need to also change further down the syntax, but I can't quite figure it out. Any assistance would be appreciated!
A much easier approach may be to use tidyr::separate_longer_delim()
instead of pivot_longer()
and a complicated pipe:
tidyr::separate_longer_delim(data = screening_data_responses_char,
cols = x214532,
delim = ",")
Output:
record x214532
1 1 shirts
2 1 shoes
3 2 shoes
4 2 purses
5 2 hats
6 3 shirts
7 3 shoes
8 3 hats
9 3 heavy machinery
10 4 sponges
11 4 shoes
12 5 hats
13 5 heavy machinery
14 6
15 7 heavy machinery
16 7 purses
17 7 shirts
18 8 heavy machinery
19 8 shoes
20 8 sponges
21 9 sponges
22 10 shoes