Search code examples
rdata-manipulationdplyr

How to get value of last non-NA column


A bit difficult to explain, but I have a dataframe with values that look like a staircase - for every date, there are different columns that have NA for some dates. I want to create a new column that has the last non-NA column value in it.

Hopefuly it makes more sense with this example:

Sample dataframe:

test <- data.frame("date" = c(as.Date("2020-01-01"), as.Date("2020-01-02"), as.Date("2020-01-03")),
                   "a" = c(4, 3, 4),
                   "b" = c(NA, 2, 1),
                   "c" = c(NA, NA, 5))

Desired output:

date............val
2020-01-01...... 4
2020-01-02...... 2
2020-01-03...... 5

I'd also prefer not to do something like take the row number of the date and take that column number + 1, but if that's the only way to do it, that's that. Thanks!


Solution

  • You can use max.col with ties.method set as "last" to get last non-NA value in each row.

    test$val <- test[cbind(1:nrow(test), max.col(!is.na(test), ties.method = 'last'))]
    test
    
    #        date a  b  c val
    #1 2020-01-01 4 NA NA   4
    #2 2020-01-02 3  2 NA   2
    #3 2020-01-03 4  1  5   5