I want combine the values of columns into a particular set of rows based on a shared row identifier. Here is a screenshot example of the data in Excel (left is current state and right is how I want it to turn out) screenshot of data with the same information below (top is current state and bottom is how I want it to turn out.
id | code | group | minutes | minutes 1 | minutes 2 | minutes 3 |
---|---|---|---|---|---|---|
1e | x222 | 1 | 4.5 | 4.5 | NA | NA |
1e | x124 | 1 | 4.5 | 4.5 | NA | NA |
1e | x143 | 2 | 6.7 | NA | 6.7 | NA |
1e | x123 | 2 | 6.7 | NA | 6.7 | NA |
1e | x123 | 2 | 6.7 | NA | 6.7 | NA |
1e | x143 | 3 | 8.9 | NA | NA | 8.9 |
1e | x123 | 3 | 8.9 | NA | NA | 8.9 |
id | code | group | minutes 1 | minutes 2 | minutes 3 |
---|---|---|---|---|---|
1e | x222 | 1 | 4.5 | 6.7 | 8.9 |
1e | x124 | 1 | 4.5 | 6.7 | 8.9 |
I want to find a way to fill the NAs (for minutes 2 and 3) in the group 1 rows with the values for minutes 2 and 3 when they share an id (and drop the rows for groups 2 and 3). There may not necessarily be a group 2 or 3 for an id, and the number of codes vary for group 1.
I have tried reshape(), pivot_wider() (on the minutes column), and a few other things, but it hasn't worked for me yet. I feel like the solution is simple but for some reason I am stuck. Any advice would be appreciated! I would prefer to do this in dplyr/the tidyverse if possible.
We could use fill
from tidyr
to replace the NAs with previous or next non-NAs after grouping
library(dplyr)
library(tidyr)
df1 %>%
group_by(id) %>%
fill(starts_with("minutes"), .direction = "downup") %>%
ungroup