I am trying to reshape my dataframe which has multiple rows of data, into a single row with dupliocated columns. Here is my example:
df <- data.frame (Variable1 = c("A", "A","A","A","A", "B", "B","B","B","B" ),
startup = c(100, 200, 300, 400, 500, 300, 400, 500, 600, 700),
end = c(110, 210, 310, 410, 510, 310, 410, 510, 610, 710)
)
> df
Variable1 startup end
1 A 100 110
2 A 200 210
3 A 300 310
4 A 400 410
5 A 500 510
6 B 300 310
7 B 400 410
8 B 500 510
9 B 600 610
10 B 700 710
which becomes
> df1
Variable1 startup1 end1 startup2 end2 startup3 end3 startup4 end4 startup5 end5
1 A 100 110 200 210 300 310 400 410 500 510
2 B 300 310 400 410 500 510 600 610 700 710
I've tried a few functions such as dcast() but that creates a column structure that is wrong and melt() which is not the right direction for reshaping the data.
Any help is welcome, ideally without having to create a loop that would assign a row to a new column set? Many thanks!
By first adding a column contains the row number and afterwards using tidyr::pivot_wider
you could do:
library(dplyr, warn = FALSE)
library(tidyr)
df |>
mutate(row = row_number()) |>
pivot_wider(names_from = row, values_from = c(startup, end), names_sep = "")
#> # A tibble: 1 × 11
#> Variable1 startup1 startup2 startup3 startup4 startup5 end1 end2 end3 end4
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 A 100 200 300 400 500 110 210 310 410
#> # ℹ 1 more variable: end5 <dbl>
EDIT In case of multiple rows group_by(Variable1)
before adding the "row" identifiers.
df <- data.frame(
Variable1 = c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B"),
startup = c(100, 200, 300, 400, 500, 300, 400, 500, 600, 700),
end = c(110, 210, 310, 410, 510, 310, 410, 510, 610, 710)
)
df |>
group_by(Variable1) |>
mutate(row = row_number()) |>
ungroup() |>
pivot_wider(names_from = row, values_from = c(startup, end), names_sep = "")
#> # A tibble: 2 × 11
#> Variable1 startup1 startup2 startup3 startup4 startup5 end1 end2 end3 end4
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 A 100 200 300 400 500 110 210 310 410
#> 2 B 300 400 500 600 700 310 410 510 610
#> # ℹ 1 more variable: end5 <dbl>