I have follow-up data for several people. If one person has 10 observations, for example, their name appears only on their first row; the 9 following rows have no name.
My goal is to fill in the name column.
Here is a reproducible example of my data:
test = data.frame(name = c("Paul",NA,NA,"John",NA,"Ethan",NA,NA),
date = c("2016-05-06","2017-05-06","2018-05-06","2012-08-09","2016-02-01","2017-06-06","2017-07-06","2017-08-06"),
data = c(1,2,1,NA,2,2,NA,2))
This is what the data looks like:
name date data
1 Paul 2016-05-06 1
2 <NA> 2017-05-06 2
3 <NA> 2018-05-06 1
4 John 2012-08-09 NA
5 <NA> 2016-02-01 2
6 Ethan 2017-06-06 2
7 <NA> 2017-07-06 NA
8 <NA> 2017-08-06 2
And this is the result I want:
name date data
1 Paul 2016-05-06 1
2 Paul 2017-05-06 2
3 Paul 2018-05-06 1
4 John 2012-08-09 NA
5 John 2016-02-01 2
6 Ethan 2017-06-06 2
7 Ethan 2017-07-06 NA
8 Ethan 2017-08-06 2
I could not find a function that carries a value forward until the next non-NA observation. For information, the data is sorted by person and by date.
One option would be tidyr::fill:
test = data.frame(name = c("Paul",NA,NA,"John",NA,"Ethan",NA,NA),
date = c("2016-05-06","2017-05-06","2018-05-06","2012-08-09","2016-02-01","2017-06-06","2017-07-06","2017-08-06"),
data = c(1,2,1,NA,2,2,NA,2))
tidyr::fill(test, name)
#> name date data
#> 1 Paul 2016-05-06 1
#> 2 Paul 2017-05-06 2
#> 3 Paul 2018-05-06 1
#> 4 John 2012-08-09 NA
#> 5 John 2016-02-01 2
#> 6 Ethan 2017-06-06 2
#> 7 Ethan 2017-07-06 NA
#> 8 Ethan 2017-08-06 2
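If you would rather not add a package dependency, the same forward fill can be sketched in base R. This is only a minimal sketch: `fill_down` is a hypothetical helper name (not from any package), and it assumes the first value of the column is not NA, which holds here because each person's first row carries the name.

```r
# Same example data as in the question
test <- data.frame(
  name = c("Paul", NA, NA, "John", NA, "Ethan", NA, NA),
  date = c("2016-05-06", "2017-05-06", "2018-05-06", "2012-08-09",
           "2016-02-01", "2017-06-06", "2017-07-06", "2017-08-06"),
  data = c(1, 2, 1, NA, 2, 2, NA, 2)
)

# Hypothetical helper: carry the last non-NA value forward.
# Assumption: x[1] is not NA (otherwise the result would be too short).
fill_down <- function(x) {
  seen <- !is.na(x)       # TRUE where a value is present
  x[seen][cumsum(seen)]   # cumsum(seen) indexes the most recent present value
}

# Fill only the name column; NAs in the data column are left untouched
test$name <- fill_down(test$name)
print(test)
```

The idea is that `cumsum(!is.na(x))` increases by one at every row that has a name, so each row ends up pointing at the most recent name above it. Since the data is sorted by person and then by date, filling downward is the right direction here.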