
R - Compare all elements in a dataframe column with an element in a row in another dataframe, across all columns


Example data:

df1 = data.frame('A' = seq(1,10), 'B' = seq(11,20), 'C' = seq(21,30))
df1
#     A  B  C
# 1   1 11 21
# 2   2 12 22
# 3   3 13 23
# 4   4 14 24
# 5   5 15 25
# 6   6 16 26
# 7   7 17 27
# 8   8 18 28
# 9   9 19 29
# 10 10 20 30

df2 = data.frame('A'=quantile(df1$A, probs=c(0.25,0.5,0.75)),
                 'B'=quantile(df1$B, probs=c(0.25,0.5,0.75)),
                 'C'=quantile(df1$C, probs=c(0.25,0.5,0.75)))
df2
#        A     B     C
# 25% 3.25 13.25 23.25
# 50% 5.50 15.50 25.50
# 75% 7.75 17.75 27.75

df3 = data.frame('A'=c(5,8,6,2,1), 'B'=c(11,12,13,19,20), 'C'=c(21,27,24,26,25))
df3
#   A  B  C
# 1 5 11 21
# 2 8 12 27
# 3 6 13 24
# 4 2 19 26
# 5 1 20 25

Desired result:

#     A B C
# Q1  3 2 4
# Med 2 2 2
# Q3  1 2 0

I want to get the number of elements in each column of df3 that are

  • i) < df2['25%',] (also this variant: < in this row, > in the others)
  • ii) > df2['50%',]
  • iii) > df2['75%',]

These results should be displayed in 3 rows as a separate data frame.


Solution

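  • You can use sapply inside a mapply call. mapply pairs the columns of df2 and df3 by position, sapply compares each quantile with every value in the matching df3 column, and colSums counts the values that are greater than each quantile, which reproduces the desired result: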

    mapply(function(x, y) colSums(sapply(x, `<`, y)), df2, df3)
    
    #     A B C
    #[1,] 3 2 4
    #[2,] 2 2 2
    #[3,] 1 2 0
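
The mapply call returns a matrix with unnamed rows, whereas the question asks for a data frame with rows labelled Q1, Med and Q3. The following is a minimal sketch of one way to get there, assuming the df1, df2 and df3 definitions from the question. The names `counts`, `res`, `below_q1` and `res_mixed` are illustrative only, and the last step reflects just one possible reading of the "< in this row, > in others" note.

```r
# Data from the question
df1 <- data.frame(A = 1:10, B = 11:20, C = 21:30)
df2 <- data.frame(A = quantile(df1$A, probs = c(0.25, 0.5, 0.75)),
                  B = quantile(df1$B, probs = c(0.25, 0.5, 0.75)),
                  C = quantile(df1$C, probs = c(0.25, 0.5, 0.75)))
df3 <- data.frame(A = c(5, 8, 6, 2, 1),
                  B = c(11, 12, 13, 19, 20),
                  C = c(21, 27, 24, 26, 25))

# For each pair of columns, count the df3 values strictly greater than
# each quantile: sapply(q, `<`, v) tests q[j] < v[i] for every pair.
counts <- mapply(function(q, v) colSums(sapply(q, `<`, v)), df2, df3)

# Convert the matrix to a data frame and label the rows.
res <- as.data.frame(counts)
rownames(res) <- c("Q1", "Med", "Q3")
res
#     A B C
# Q1  3 2 4
# Med 2 2 2
# Q3  1 2 0

# Assumed variant: count values BELOW the first quartile in the Q1 row,
# keeping "greater than" for the other two rows.
below_q1 <- mapply(function(q, v) sum(v < q[1]), df2, df3)
res_mixed <- res
res_mixed["Q1", ] <- below_q1
res_mixed
#     A B C
# Q1  2 3 1
# Med 2 2 2
# Q3  1 2 0
```

Both comparisons are strict, so a value exactly equal to a quantile would be counted in neither direction. Use `<=` or `>=` if ties should be included.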