snp1 <- c("AA", "AT", "AA", "TT", "AA", "AT", "AA", "AA", "AA", "AT")
snp2 <- c("GG", "GC", "GG", "CC", "CC", "GC", "GG", "GG", "GG", "GC")
df1 <- data.frame(snp1, snp2)
num1 <- c(1, 2, 1, 3, 1, 2, 1, 1, 1, 2)
num2 <- c(1, 2, 1, 3, 3, 2, 1, 1, 1, 2)
df2 <- data.frame(num1, num2)
I am working in R. I have a data frame df1 that I want to convert to df2. Within each column of df1, the most common value should become 1, the second most common value 2, and so on. How can I do this efficiently?
Variation on a theme:
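# names(table(x)) works for both character and factor columns;
# levels(x) is NULL for character columns (the data.frame() default since R 4.0)
lapply(df1, function(x) { tab <- table(x); match(x, names(tab)[order(-tab)]) })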
#$snp1
# [1] 1 2 1 3 1 2 1 1 1 2
#
#$snp2
# [1] 1 2 1 3 3 2 1 1 1 2
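
Since lapply() returns a list and the question asks for a data frame, here is a minimal sketch of one way to keep the data frame shape. The helper name rank_by_freq and the result name df1_num are made up for illustration. It assumes ties in frequency can be broken by the sorted order of the values, which is what table() plus order() appears to give.

```r
# Hypothetical helper: recode a vector so the most frequent value becomes 1,
# the second most frequent 2, and so on.
rank_by_freq <- function(x) {
  tab <- table(x)                      # counts per distinct value (NA dropped)
  match(x, names(tab)[order(-tab)])    # position of each value in the frequency ranking
}

df1 <- data.frame(
  snp1 = c("AA", "AT", "AA", "TT", "AA", "AT", "AA", "AA", "AA", "AT"),
  snp2 = c("GG", "GC", "GG", "CC", "CC", "GC", "GG", "GG", "GG", "GC")
)

# Assigning into df1_num[] replaces the columns but keeps the
# data frame structure and the original column names.
df1_num <- df1
df1_num[] <- lapply(df1, rank_by_freq)
```

The columns of df1_num are integer vectors named snp1 and snp2, so rename them if you need num1 and num2 as in df2.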