I have the below dataframe, where each observation has a string of key and values (k:v)s of different objects. Each object is separated by a \n
df = tribble(
~col, ~summary,
"a", "Oranges: 1\nApples: 1\nPeaches: 6\nBananas: 2",
"b", "Apples: 2\nBananas: 1",
"c", "Oranges: 4\nPeaches: 2\nBananas: 5",
"d", "Peaches: 2"
)
I have separated those values as below:
df_sep = df %>%
separate(summary,c("col1","col2","col3","col4"),sep = "\n")
I would like to have the below output using dplyr possibly with across
and any
if applciable.
# A tibble: 4 x 10
col summary Apples_flag Bananas_flag oranges_flag Peaches_flag_value Apples_value Bananas_value oranges_value Peaches_value
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 b Apples: 2, Bananas: 1 1 1 0 0 2 1 0 0
2 a Oranges: 1, Apples: 1, Peaches: 6, Bananas: 2 1 1 1 1 1 2 1 6
3 c Oranges: 4, Peaches: 2, Bananas: 5 0 1 1 1 0 5 4 2
4 d Peaches: 2 0 0 0 1 0 0 0 2
Another way would be:
df %>%
left_join(df%>%
separate_rows(summary, sep='\n') %>%
separate(summary, c('name', 'value'), convert = TRUE) %>%
pivot_wider(col,names_glue = '{name}_value', values_fill = 0) %>%
mutate(across(-col, sign, .names = '{.col}_flag')))