I have a number of columns in a data frame that represent replicates of an experimental result.
Here is an example:
    1a  2a  3a  4a  5a
1  154 152 154 156  NA
2  154 154 154  NA  NA
3  154 154 154 154  NA
4  154 154 154 154  NA
5  154  NA 154 154  NA
6   NA  NA  NA 154  NA
7  154 154  NA 154  NA
8  154 154  NA 154  NA
9  154  NA 154 150  NA
10 149 149  NA 149 149
What I would like is to create another column containing, for each row, the value that occurs at least twice (>= 2 times) across the other columns:
    1a  2a  3a  4a  5a score
1  154 152 154 156  NA   154
2  154 154 154  NA  NA   154
3  154 154 154 154  NA   154
4  154 154 154 154  NA   154
5  154  NA 154 154  NA   154
6   NA  NA  NA 154  NA    NA
7  154 154  NA 154  NA   154
8  154 154  NA 154  NA   154
9  154  NA 154 150  NA   154
10 149 149  NA 149 149   149
EDIT: I modified the example above to demonstrate. flodel's answer of using the mode worked initially; however, it would return a value even if that value occurred only once. I would like the result to be either NA or a character string (whichever is easier) when no value occurs at least twice within a row.
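You are not looking for the median but the mode, which is easy enough to define yourself. The min.freq argument sets the minimum number of occurrences a value needs in order to count; if no value reaches it, the function returns NA: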
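Mode <- function(x, min.freq = 1L) {
  f <- table(x)            # counts per distinct value; NAs are dropped
  k <- f[f >= min.freq]    # keep values seen at least min.freq times
  # most frequent qualifying value, or NA if none qualifies
  if (length(k) > 0L) as.numeric(names(k)[which.max(k)]) else NA
}
# test2 is assumed to hold only the replicate columns
test$score <- apply(test2, 1, Mode, min.freq = 2L)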
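To check this against the example in the question, here is a minimal, self-contained sketch. The data frame is rebuilt by hand from the table above, and the names `reps` and `expected` are only illustrative. It assumes the replicate columns are exactly 1a to 5a and that the score should be numeric.

```r
# Rebuild the example data; check.names = FALSE keeps names such as "1a"
# that would otherwise be rewritten because they start with a digit.
test <- data.frame(
  `1a` = c(154, 154, 154, 154, 154, NA, 154, 154, 154, 149),
  `2a` = c(152, 154, 154, 154, NA, NA, 154, 154, NA, 149),
  `3a` = c(154, 154, 154, 154, 154, NA, NA, NA, 154, NA),
  `4a` = c(156, NA, 154, 154, 154, 154, 154, 154, 150, 149),
  `5a` = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 149),
  check.names = FALSE
)

# Mode with a minimum frequency: returns NA when no value occurs
# at least min.freq times (table() ignores NA by default).
Mode <- function(x, min.freq = 1L) {
  f <- table(x)
  k <- f[f >= min.freq]
  if (length(k) > 0L) as.numeric(names(k)[which.max(k)]) else NA
}

# Restrict to the replicate columns so that re-running the line
# does not feed an existing score column back into the calculation.
reps <- c("1a", "2a", "3a", "4a", "5a")
test$score <- apply(test[reps], 1, Mode, min.freq = 2L)

# Scores expected from the question's second table.
expected <- c(154, 154, 154, 154, 154, NA, 154, 154, 154, 149)
print(test)
```

Note that which.max returns the first maximum, so if two values tie for the highest count, the one that comes first in the table (as far as I can tell, the smaller one) is returned. If ties matter for your data you would need to decide how to handle them explicitly.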