I need to find out which rows are unique to each column of a data.frame (that is, non-zero in that column only) and then print their row names in the output.
My data example:
id A B C
s1 1 2 1
s2 1 0 0
s3 0 12 3
s4 0 1 0
s5 0 1 0
I'd like to get something like this:
$A s2
$B s4,s5
$C NA
Which means that:
A has only one unique element - s2
B has two unique elements - s4 and s5
and C has no unique elements, so it's filled with NA
I've tried
apply(data, 2, function(x) unique(x))
but it's not what I need.
Thanks a lot for suggestions!
Here is a rough base R solution:
helper <- function(x) {
  # TRUE where the row has a positive value
  has_p <- x > 0
  # keep the flags only if exactly one column is positive
  if (sum(has_p) != 1) has_p[] <- FALSE
  has_p
}
step1 <- as.data.frame(t(apply(df[-1], 1, helper)))
lapply(step1, function(x) df[[1]][x])
$A
[1] "s2"
$B
[1] "s4" "s5"
$C
character(0)
Edit
Here is a much simpler logic for the same solution:
rows <- rowSums(df[-1] > 0) == 1
lapply(df[-1], function(x) df[["id"]][rows & x > 0])
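To see why this works, it can help to look at the intermediate pieces one at a time. The following is only a sketch: it rebuilds the sample data with data.frame() so it runs on its own, and the name positive is introduced here for illustration and is not part of the answer's code.

```r
# Sample data from the question, rebuilt so the snippet is self-contained
df <- data.frame(
  id = c("s1", "s2", "s3", "s4", "s5"),
  A = c(1L, 1L, 0L, 0L, 0L),
  B = c(2L, 0L, 12L, 1L, 1L),
  C = c(1L, 0L, 3L, 0L, 0L),
  stringsAsFactors = FALSE
)

# Logical matrix: TRUE wherever a value is positive
# ("positive" is an illustrative name, not from the answer)
positive <- df[-1] > 0

# TRUE for rows with exactly one positive value (here s2, s4 and s5)
rows <- rowSums(positive) == 1

# For column B: ids of rows whose single positive value sits in B
ids_b <- df[["id"]][rows & df[["B"]] > 0]
print(ids_b)
```

The lapply() call above simply repeats the last step for every column except id.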
Edit 2
Put into one step (and return the correct output, NA, when nothing is unique):
lapply(
  as.data.frame(df[-1] > 0 & rowSums(df[-1] > 0) == 1),
  function(x) {
    if (all(!x)) return(NA)
    df[["id"]][x]
  }
)
Data
df <- structure(list(id = c("s1", "s2", "s3", "s4", "s5"), A = c(1L,
1L, 0L, 0L, 0L), B = c(2L, 0L, 12L, 1L, 1L), C = c(1L, 0L, 3L,
0L, 0L)), row.names = c(NA, -5L), class = "data.frame")
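If this is needed for more than one data frame, the one-step version could be wrapped in a small function. This is a minimal sketch under some assumptions: the function name exclusive_ids and its id_col argument are hypothetical (they do not appear above), all non-id columns are assumed to be numeric without missing values, and NA_character_ is used in place of NA so that every list element has the same type.

```r
# Hypothetical helper; the name and arguments are illustrative only
exclusive_ids <- function(data, id_col = "id") {
  # All columns except the id column
  values <- data[setdiff(names(data), id_col)]
  # Logical matrix: TRUE where a value is positive
  positive <- values > 0
  # TRUE for rows with exactly one positive value
  only_one <- rowSums(positive) == 1
  # The length-nrow vector is recycled down each column of the matrix
  lapply(as.data.frame(positive & only_one), function(x) {
    if (!any(x)) return(NA_character_)
    data[[id_col]][x]
  })
}

# Sample data, rebuilt here so the snippet runs on its own
df <- data.frame(
  id = c("s1", "s2", "s3", "s4", "s5"),
  A = c(1L, 1L, 0L, 0L, 0L),
  B = c(2L, 0L, 12L, 1L, 1L),
  C = c(1L, 0L, 3L, 0L, 0L),
  stringsAsFactors = FALSE
)

result <- exclusive_ids(df)
print(result)
```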