I have a set of data like below:
BETA_LACT R I S
- 23 25 91
- 30 0 109
- 0 0 136
+ 73 0 0
+ 14 0 59
+ 0 0 49
I want to convert the data to the format below:
R_- I_- S_- R_+ I_+ S_+
23 25 91 73 0 0
30 0 109 14 0 59
0 0 136 0 0 49
I tried spread() but failed, could anybody help me?
I suspect your problem using spread
and gather
is that there is nothing your sample data to suggest which rows should be collapsed. As a human, I can observe that you wish to combine rows 1 and 4, 2 and 5, etc. However, there are no other columns or "keys" persay in your dataset to indicate this.
One solution would be to add an index column as I show in the second example below using group_by
and mutate
. The following reprex
(reproducible example) shows both a non-working example analogous to your case and a working example.
library(tidyr)
library(dplyr)
example_data <- data.frame(
categ = rep(1:3, 3),
x = 1:9,
y = 11:19,
z = 21:29
)
# won't work
example_data %>%
gather(var, value, -categ) %>%
unite(new_col_name, var, categ) %>%
spread(new_col_name, value)
#> Error: Duplicate identifiers for rows (1, 4, 7), (2, 5, 8), (3, 6, 9), (10, 13, 16), (11, 14, 17), (12, 15, 18), (19, 22, 25), (20, 23, 26), (21, 24, 27)
# will work
example_data %>%
group_by(categ) %>%
mutate(id = row_number()) %>%
gather(var, value, -categ, -id) %>%
unite(new_col_name, var, categ) %>%
spread(new_col_name, value)
#> # A tibble: 3 x 10
#> id x_1 x_2 x_3 y_1 y_2 y_3 z_1 z_2 z_3
#> * <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
#> 1 1 1 2 3 11 12 13 21 22 23
#> 2 2 4 5 6 14 15 16 24 25 26
#> 3 3 7 8 9 17 18 19 27 28 29
(As a sidenote, please check out the reprex
package! This helps you make a "reproducible example" and ask better questions which will facilitate better community support. Notice how easy it is to copy the above code and run it locally.)