Search code examples
rreshape

R - How to reshape multiple rows into single row with same column structure


I am trying to reshape my dataframe which has multiple rows of data, into a single row with dupliocated columns. Here is my example:

df <- data.frame (Variable1  = c("A", "A","A","A","A", "B", "B","B","B","B" ),
                  startup = c(100, 200, 300, 400, 500, 300, 400, 500, 600, 700),
                  end = c(110, 210, 310, 410, 510, 310, 410, 510, 610, 710)
                  
)

> df
   Variable1 startup end
1          A     100 110
2          A     200 210
3          A     300 310
4          A     400 410
5          A     500 510
6          B     300 310
7          B     400 410
8          B     500 510
9          B     600 610
10         B     700 710

which becomes

> df1  
  Variable1 startup1 end1 startup2 end2 startup3 end3 startup4 end4 startup5 end5
1         A      100  110      200  210      300  310      400  410      500  510
2         B      300  310      400  410      500  510      600  610      700  710

I've tried a few functions such as dcast() but that creates a column structure that is wrong and melt() which is not the right direction for reshaping the data.

Any help is welcome, ideally without having to create a loop that would assign a row to a new column set? Many thanks!


Solution

  • By first adding a column contains the row number and afterwards using tidyr::pivot_wider you could do:

    library(dplyr, warn = FALSE)
    library(tidyr)
    
    df |>
      mutate(row = row_number()) |>
      pivot_wider(names_from = row, values_from = c(startup, end), names_sep = "")
    #> # A tibble: 1 × 11
    #>   Variable1 startup1 startup2 startup3 startup4 startup5  end1  end2  end3  end4
    #>   <chr>        <dbl>    <dbl>    <dbl>    <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl>
    #> 1 A              100      200      300      400      500   110   210   310   410
    #> # ℹ 1 more variable: end5 <dbl>
    

    EDIT In case of multiple rows group_by(Variable1) before adding the "row" identifiers.

    df <- data.frame(
      Variable1 = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B"),
      startup = c(100, 200, 300, 400, 500, 300, 400, 500, 600, 700),
      end = c(110, 210, 310, 410, 510, 310, 410, 510, 610, 710)
    )
    
    df |>
      group_by(Variable1) |>
      mutate(row = row_number()) |>
      ungroup() |>
      pivot_wider(names_from = row, values_from = c(startup, end), names_sep = "")
    #> # A tibble: 2 × 11
    #>   Variable1 startup1 startup2 startup3 startup4 startup5  end1  end2  end3  end4
    #>   <chr>        <dbl>    <dbl>    <dbl>    <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl>
    #> 1 A              100      200      300      400      500   110   210   310   410
    #> 2 B              300      400      500      600      700   310   410   510   610
    #> # ℹ 1 more variable: end5 <dbl>