I need to turn long data into wide data, but with new numbered columns (col1, col2, etc.) created as needed. (In other words, the col names are not taken from keys, but generated generically.) So from something like this:
d <- data.frame(row_id = c(1,1,2,2,2,3,3,3),cat = c("a","b","a","d","e","d","f","g"))
row_id cat
1 1 a
2 1 b
3 2 a
4 2 d
5 2 e
6 3 d
7 3 f
8 3 g
To something like this:
row_id,cat1,cat2,cat3
1,a,b,,
2,a,d,e
3,d,f,g
spread() and pivot_wider() don't really fit the bill, since I'm working with categorical values and a large number of categories (but no more than 8 per id). In python, I might iterate through the rows, appending new key:value pairs ("cat"+i:"a")to a dictionary within a list (or something), but not sure what the analogous method would be in R or if there's a better approach altogether.
We create a sequence column by 'row_id' and then pivot to wide with pivot_wider
library(dplyr)
library(tidyr)
library(stringr)
library(data.table)
d %>%
mutate(rn = str_c('cat', rowid(row_id))) %>%
pivot_wider(names_from = rn, values_from = cat)
-output
# A tibble: 3 x 4
# row_id cat1 cat2 cat3
# <dbl> <chr> <chr> <chr>
#1 1 a b <NA>
#2 2 a d e
#3 3 d f g