Note that there are solutions to other questions that may resolve this specific question, such as Replace NA with previous or next value, by group, using dplyr. However, this question isn't about replacing NA's, NA's are OK in this question in certain circumstances. This question addresses replacing all cells in a group in a dataframe that fall after the the first non-NA in that group, with that first non-NA value. When I researched this issue I didn't find solutions that fit because I only want to replace NA's in certain circumstances (NA's in a group that occur prior to the first non-NA in that group remain; and a group with all NA's and no non-NA in that group retain all their NA's).
Is there a method, with a preference for dplyr or data.table, for extending target values down an R dataframe range when specified row conditions are met in a row within a group? I vaguely remember an rleid
function in data.table that may do the trick but I'm having trouble implementing. Either as a new column or by over-writing existing column "State" in my below example.
For example, if we start with the below example dataframe, I'd like to send the target value of 1 in each row to the end of each ID grouping, after the first occurrence of that target value of 1 in a group, and as better explained in the illustration underneath:
myDF <- data.frame(
ID = c(1,1,1,1,2,2,2,3,3,3),
State = c(NA,NA,1,NA,1,NA,NA,NA,NA,NA))
You can use fill
:
library(tidyr)
myDF %>%
group_by(ID) %>%
fill(State, .direction = "down")
# A tibble: 10 × 2
# Groups: ID [3]
ID State
<dbl> <dbl>
1 1 NA
2 1 NA
3 1 1
4 1 1
5 2 1
6 2 1
7 2 1
8 3 NA
9 3 NA
10 3 NA