I have a dataset like this:
data <- data.frame(Time = c(1,4,6,9,11,13,16, 25, 32, 65),
A = c(10, NA, 13, 2, 32, 19, 32, 34, 93, 12),
B = c(1, 99, 32, 31, 12, 13, NA, 13, NA, NA),
C = c(2, 32, NA, NA, NA, NA, NA, NA, NA, NA))
What I want to retrieve are the values in Time
that corresponds to the last numerical value in A
, B
, and C
.
For example, the last numerical values for A
, B
, and C
are 12, 13, and 32 respectively.
So, the Time
values that correspond are 65, 25, and 4.
I've tried something like data[which(data$Time== max(data$A)), ]
, but this doesn't work.
We can multiply the row
index with the logical matrix
, and get the colMaxs
(from matrixStats
) to subset the 'Time' column
library(matrixStats)
data$Time[colMaxs((!is.na(data[-1])) * row(data[-1]))]
#[1] 65 25 4
Or using base R
, we get the index with which/arr.ind
, get the max
index using a group by operation (tapply
) and use that to extract the 'Time' value
m1 <- which(!is.na(data[-1]), arr.ind = TRUE)
data$Time[tapply(m1[,1], m1[,2], FUN = max)]
#[1] 65 25 4
Or with summarise/across
in the devel version of dplyr
library(dplyr)
data %>%
summarise(across(A:C, ~ tail(Time[!is.na(.)], 1)))
# A B C
#1 65 25 4
Or using summarise_at
with the current version of dplyr
data %>%
summarise_at(vars(A:C), ~ tail(Time[!is.na(.)], 1))