I have a series of columns in a data.frame of which I'd like to get the last value, excluding any NAs. The function I'm using to get this done is:
last_value <- function(x) tail(x[!is.na(x)], 1)
I'm using apply() to run this function across the 13 columns, for each observation (by row):
df$LastVal<-apply(df[,c(116, 561, 1006, 1451, 1896, 2341, 2786, 3231,
3676, 4121, 4566, 5011, 5456)], 1, FUN=last_value)
My problem is that the output comes out as a list of length 5336 (the total number of observations) instead of a plain vector of the last values by row. The values seem to be there, but in list form. I've used this function before and it worked fine. When I str() my columns, they're all integers.
Could this function get tripped up if there are no values and only NAs?
I should add that when I unlist() the new variable, I get an error that says "replacement has 4649 rows, data has 5336", so I do think this might have something to do with NAs.
First, you need to see what the function last_value, as you have defined it, returns for a row that contains only NA values.
last_value <- function(x) tail(x[!is.na(x)], 1)
df <- matrix(1:24, 4)
df[2, ] <- NA
df <- as.data.frame(df)
apply(df, 1, last_value)
#[[1]]
#V6
#21
#
#[[2]]
#named integer(0)
#
#[[3]]
#V6
#23
#
#[[4]]
#V6
#24
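The problem is that the second member of this list is of length zero. Because the results do not all have the same length, apply returns a list instead of a vector. This also means that unlist will not solve the problem: it drops the zero-length elements, so the result is shorter than the number of rows.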
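To make the length mismatch concrete, here is a small sketch on the same toy data. It only uses base R; lengths() reports the length of each list element, and the variable names res and flat are just illustrative. In the question, 5336 - 4649 = 687, which would be consistent with 687 rows containing only NAs, though that is an inference from the error message rather than something checked against the real data.

```r
# Original definition: returns a zero-length vector for an all-NA row
last_value <- function(x) tail(x[!is.na(x)], 1)

# Same toy data as above, with row 2 set entirely to NA
df <- matrix(1:24, 4)
df[2, ] <- NA
df <- as.data.frame(df)

res <- apply(df, 1, last_value)   # a list, because lengths differ

lengths(res)
#[1] 1 0 1 1

flat <- unlist(res)               # the empty element is silently dropped
length(flat)
#[1] 3
nrow(df)
#[1] 4
```

Assigning flat to a new column of df would fail for the same reason as in the question: 3 values for 4 rows.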
You have to test for a value of length zero.
last_value <- function(x) {
  y <- tail(x[!is.na(x)], 1)
  if (length(y) == 0) NA else y
}
apply(df, 1, last_value)
#[1] 21 NA 23 24
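If you want R to enforce that every row yields exactly one integer, a possible variant is to use vapply over the row indices instead of apply. This is only a sketch under a few assumptions: the columns are integer (as str() reported in the question), and cols is a hypothetical stand-in for the 13 column positions, chosen here to fit the toy data.

```r
# Variant that always returns exactly one integer (NA_integer_ when empty)
last_value <- function(x) {
  y <- x[!is.na(x)]
  if (length(y) == 0L) NA_integer_ else y[[length(y)]]
}

# Toy data from above, with row 2 set entirely to NA
df <- matrix(1:24, 4)
df[2, ] <- NA
df <- as.data.frame(df)

cols <- c(2, 4, 6)          # hypothetical column positions, for illustration
m <- as.matrix(df[, cols])  # integer matrix of just those columns

# vapply stops with an error if any result is not a single integer,
# so a list can never come back silently
df$LastVal <- vapply(seq_len(nrow(m)),
                     function(i) last_value(m[i, ]),
                     integer(1))
df$LastVal
#[1] 21 NA 23 24
```

The design choice here is strictness: apply simplifies its result when it can and falls back to a list when it cannot, whereas vapply fails immediately if the function ever returns something of the wrong type or length.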