I have a dataframe where one column contains an ID, and the other column contains a series of words with numbers associated with them. As and example:
ColA ColB
A Banana:2;Kiwi:4;Apple:1
B Kiwi:1;Orange:4
I would like to duplicate each of the words in ColB by the number associated with it. So, the output I'm looking for would be:
ColA ColB
A Banana;Banana;Kiwi;Kiwi;Kiwi;Kiwi;Apple
B Kiwi;Orange;Orange;Orange;Orange
Any suggestions or solutions would be greatly appreciated! Thanks
cbind(df, C = sapply(strsplit(df$ColB, ';'),
\(x)paste(do.call(rep, unname(read.table(text=x, sep=':'))), collapse = ';')))
ColA ColB C
1 A Banana:2;Kiwi:4;Apple:1 Banana;Banana;Kiwi;Kiwi;Kiwi;Kiwi;Apple
2 B Kiwi:1;Orange:4 Kiwi;Orange;Orange;Orange;Orange
df %>%
separate_longer_delim(ColB, delim = ';')%>%
separate_wider_delim(ColB, names = c('x','times'), delim = ':')%>%
uncount(as.numeric(times))%>%
summarise(ColB = str_c(x, collapse = ';'), .by = ColA)
# A tibble: 2 × 2
ColA ColB
<chr> <chr>
1 A Banana;Banana;Kiwi;Kiwi;Kiwi;Kiwi;Apple
2 B Kiwi;Orange;Orange;Orange;Orange
transform(df, ColB = gsubfn::gsubfn("(\\w+):(\\d+)", \(x,y)paste(rep(x,as.numeric(y)), collapse = ";"), ColB))
ColA ColB
1 A Banana;Banana;Kiwi;Kiwi;Kiwi;Kiwi;Apple
2 B Kiwi;Orange;Orange;Orange;Orange