Search code examples
rdplyrlong-format-datawide-format-data

R: Long to Wide dataframe with concatenation of column values into one


I have a dataframe with the following style.

df = data.frame(Process = rep("Global", 6),
                Category = c("Category_1", "Category_1", "Category_2", "Category_2", "Category_3", "Category_3"),
                Metric = c("Metric_1_1", "Metric_1_2", "Metric_2_1", "Metric_2_2", "Metric_3_1", "Metric_3_2"),
                level_1 = c(0.50, 0.50, 0.50, 0.50, 0.50, 0.50),
                level_2 = c(0.75, 0.50, 0.50, 0.50, 0.50, 0.50))

I would like to convert it to the following dataframe.

df_wide = data.frame(Global_Category_1_Metric_1_1_Level_1 = c(0.5),
                     Global_Category_1_Metric_1_1_Level_2 = c(0.75),
                     Global_Category_1_Metric_1_2_Level_1 = c(0.5),
                     Global_Category_1_Metric_1_2_Level_2 = c(0.5),
                     Global_Category_2_Metric_2_1_Level_1 = c(0.5),
                     Global_Category_2_Metric_2_1_Level_2 = c(0.5),
                     Global_Category_1_Metric_2_2_Level_1 = c(0.5),
                     Global_Category_1_Metric_2_2_Level_2 = c(0.5),
                     Global_Category_3_Metric_3_1_Level_1 = c(0.5),
                     Global_Category_3_Metric_3_1_Level_2 = c(0.5),
                     Global_Category_1_Metric_3_2_Level_1 = c(0.5),
                     Global_Category_1_Metric_3_2_Level_2 = c(0.5))

Basically, I want to concatenate the values of the Process, Category, Metric columns and combine it with the level_1 and level_2 columns to create one column for each "new" metric.

Ideally, I would like for the new name to have colons between the original values, i.e., Global:Category_1:Metric_1_1:Level_1.

I'm not sure how to achieve that with dplyr. However, if necessary I'm not looking for a solution that exclusively uses dplyr.


Solution

  • Probably you can try

    df %>%
        pivot_longer(starts_with("level")) %>%
        pivot_wider(names_from = !value, names_sep = ":")
    

    and with str you will see

    > df %>%
    +     pivot_longer(starts_with("level")) %>%
    +     pivot_wider(names_from = !value, names_sep = ":") %>%
    +     str()
    tibble [1 × 12] (S3: tbl_df/tbl/data.frame)
     $ Global:Category_1:Metric_1_1:level_1: num 0.5
     $ Global:Category_1:Metric_1_1:level_2: num 0.75
     $ Global:Category_1:Metric_1_2:level_1: num 0.5
     $ Global:Category_1:Metric_1_2:level_2: num 0.5
     $ Global:Category_2:Metric_2_1:level_1: num 0.5
     $ Global:Category_2:Metric_2_1:level_2: num 0.5
     $ Global:Category_2:Metric_2_2:level_1: num 0.5
     $ Global:Category_2:Metric_2_2:level_2: num 0.5
     $ Global:Category_3:Metric_3_1:level_1: num 0.5
     $ Global:Category_3:Metric_3_1:level_2: num 0.5
     $ Global:Category_3:Metric_3_2:level_1: num 0.5
     $ Global:Category_3:Metric_3_2:level_2: num 0.5