Search code examples
rmultiple-columnsreshapetidyr

Long to wide data


I have been trying to reshape some data from long into the wide format. I am interested in having one row per unique ID. To mimic my requirements, I have created a sample input and the desired output as below:

Input:

id   date   size    category    name    type
124  3.1      1     fruit       apple   royalGala
327  1.1      0     veg         chilli  green
124  2.1      2     fruit       apple   green
124  1.2      1     fruit       apple   jazz
124  2.2      2     fruit       apple   eve
124  2.1      3     fruit       apple   pinkLady
327  1.2      1     veg         chilli  red
327  1.2      2     veg         chilli  Jalapeño
327  1.2      3     veg         chilli  bananaPepper
327  3.3      1     veg         chilli  fresnoPepper

Output:

id  fruit_apple_royalGala_date  fruit_apple_royalGala_size  fruit_apple_green_date  fruit_apple_green_size  fruit_apple_jazz_date   fruit_apple_jazz_size   fruit_apple_eve_date    fruit_apple_eve_size    fruit_apple_pinkLady_date   fruit_apple_pinkLady_size   veg_chilli_green_date   veg_chilli_green_size   veg_chilli_red_date veg_chilli_red_size veg_chilli_Jalapeño_date    veg_chilli_Jalapeño_size    veg_chilli_bananaPepper_date    veg_chilli_bananaPepper_size    veg_chilli_fresnoPepper_date    veg_chilli_fresnoPepper_size
124 3.1                         1                           2.1                     2                       1.2                     1                       2.2                     2                       2.1                         3                           NA                      NA                      NA                  NA                  NA                          NA                          NA                              NA                              NA                              NA
327 NA                          NA                          NA                      NA                      NA                      NA                      NA                      NA                      NA                          NA                          1.1                     0                       1.2                 1                   1.2                         2                           1.2                             3                               3.3                             1

I am unsure how to achieve the desired output. I have looked at some relevant questions at StackOverflow but none of them help me solve this problem e.g., Convert data from long format to wide format with multiple measure columns, From long to wide data with multiple columns and Reshape multiple value columns to wide format.

I have been working on this problem since yesterday but having very little experience with gather and spread etc., I have not been able to solve it. I would greatly appreciate any assistance on this.

Thank you!

dput()

For convenience, I am also copying dput()

structure(list(
id = c(124L, 327L, 124L, 124L, 124L, 124L, 327L, 327L, 327L, 327L), 
date = c(3.1, 1.1, 2.1, 1.2, 2.2, 2.1, 1.2, 1.2, 1.2, 3.3), 
size = c(1L, 0L, 2L, 1L, 2L, 3L, 1L, 2L, 3L, 1L), 
category = c("fruit", "veg", "fruit", "fruit", "fruit", "fruit", "veg", "veg", "veg", "veg"), 
name = c("apple", "chilli", "apple", "apple", "apple", "apple", "chilli", "chilli", "chilli", "chilli"), 
type = c("royalGala", "green", "green", "jazz", "eve", "pinkLady", "red", "Jalapeño", "bananaPepper", "fresnoPepper")), 
class = "data.frame", row.names = c(NA, -10L))

Partial Solution

I have a work around to solve this problem but my solution fails when I run it on my original data.

# Read the csv file
df = read.csv("C:/Desktop/test.csv")

# Unite multiple columns in to one
df_unite = df %>% 
  unite("info", category:type, remove = TRUE)

# Conversion from long into wide format
setDT(df_unite)   # coerce to data.table
df_wide <- dcast(df_unite, id ~ info, 
                value.var = c("date", "size"))

Solution

  • df %>%
      pivot_wider(id, c(category, name, type), 
                values_from = c(date, size),
                names_glue = '{category}_{name}_{type}_{.value}')