I have a question regarding the na.locf function in the zoo package. In the data frame below I want to remove the leading NAs (for years 1987 and 1988) but keep any NA that comes after a valid value (such as 1993).
Year X
1987 NA
1988 NA
1989 2
1990 5
1991 9
1992 16
1993 NA
1994 27
1995 36
Does anyone have a solution for this problem?
The na.locf function is designed for filling missing observations, not removing them. The zoo package also has a na.trim function, which removes leading and/or trailing missing observations:
na.trim(mydf)
which gives:
> na.trim(mydf)
Year X
3 1989 2
4 1990 5
5 1991 9
6 1992 16
7 1993 NA
8 1994 27
9 1995 36
With the sides parameter you can choose whether to remove only the leading missing observations, only the trailing ones, or both (the default). For example, sides = 'right' removes only the trailing missing observations and keeps the leading ones:
> na.trim(mydf, sides = 'right')
Year X
1 1987 NA
2 1988 NA
3 1989 2
4 1990 5
5 1991 9
6 1992 16
7 1993 NA
8 1994 27
9 1995 36
Conversely, sides = 'left' removes only the leading missing observations and keeps the trailing ones:
> na.trim(mydf, sides = 'left')
Year X
3 1989 2
4 1990 5
5 1991 9
6 1992 16
7 1993 NA
8 1994 27
9 1995 36
10 1996 NA
Used data:
mydf <- structure(list(Year = 1987:1996, X = c(NA, NA, 2L, 5L, 9L, 16L, NA, 27L, 36L, NA)),
.Names = c("Year", "X"), class = "data.frame", row.names = c(NA,-10L))
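If you would rather not load zoo just for this, the leading-only trim can also be sketched in base R. This is only an illustration of the idea, not how na.trim is implemented; the names keep and trimmed are made up for the example, and it looks at column X alone rather than at every column.

```r
# Same data as above, rebuilt with data.frame() so the snippet runs on its own
mydf <- data.frame(Year = 1987:1996,
                   X = c(NA, NA, 2L, 5L, 9L, 16L, NA, 27L, 36L, NA))

# cumsum(!is.na(x)) stays at 0 until the first non-missing value is reached,
# so "> 0" is FALSE for the leading NAs and TRUE for every row from there on
keep <- cumsum(!is.na(mydf$X)) > 0

# Drop the leading NA rows; interior and trailing NAs are left untouched
trimmed <- mydf[keep, ]
trimmed
```

On this data the result should match na.trim(mydf, sides = 'left'): the 1987 and 1988 rows are dropped, while the NAs for 1993 and 1996 stay. If X contains no valid value at all, keep is FALSE everywhere and you get a zero-row data frame.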