Tags: r, dplyr, quotes, rlang, non-standard-evaluation

Tally across columns with variable condition in R


I am trying to tally, across the columns of a data frame, the values that exceed a corresponding limit variable. A similar problem has been asked before, but here the condition may change for each point, so rowSums is not an option. I am interested in modifications to the solution I have attempted below, as well as any simpler or more efficient methods that solve this issue.

library(tidyverse)

set.seed(234)

ex_dat <- tibble(a = rnorm(n = 6, mean = 1),
                 a_lim = 0.75,
                 b = rnorm(n = 6, mean = 0.5),
                 b_lim = 0.333,
                 c = rnorm(n = 6, mean = 1.5),
                 c_lim = 1.0,
                 d = rnorm(n = 6, mean = 1.5),
                 d_lim = 1.25)

ex_dat %>%
  rowwise() %>%
  mutate(tally = sum(map_lgl(.x = c("a","b","c","d"),
                             .f = ~(noquote(.x) > noquote(paste0(.x,"_lim")))), na.rm = T))

For instance, the desired outcome would be that the 'tally' column here reads 4 in the first row, given that all 4 conditions are met. Then, it should read 2 for the second row, given that only 2 limits were exceeded.

I have tried a few variations on this approach, using the bang-bang operator to force evaluation of the variables in the .f argument to the map function. So far, the attempt above, which fails silently, seems to be the closest and most sensible one I can wrap my head around. Evidently, I don't have a very firm grasp of non-standard evaluation, so my attempts with !! and sym() didn't take me far.
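The silent failure can be reproduced outside of mutate(). The sketch below uses a small made-up data frame (`toy` is a hypothetical name, not part of the question) to show that noquote() only changes how a string is printed, so the original comparison is between two character strings rather than two columns. Looking the columns up by name with `[[` is one way to get an actual numeric comparison.

```r
# Hypothetical two-row data frame, used only for illustration.
toy <- data.frame(a = c(1.0, 0.5), a_lim = 0.75)

# noquote() returns a character vector with a print-only class,
# so this compares the strings "a" and "a_lim", not the columns.
string_cmp <- noquote("a") > noquote(paste0("a", "_lim"))

# Looking the columns up by name compares the stored values instead.
value_cmp <- toy[["a"]] > toy[[paste0("a", "_lim")]]

print(string_cmp)
print(value_cmp)
```

Because the string comparison returns a single logical value without raising an error, the tally is computed without complaint but never reflects the data.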

Again, if anyone sees that I am going about this problem in a roundabout or inefficient way, I would welcome redirection. Thank you.


Solution

  • Here is a slightly different way to approach your problem. It relies on the "trick" that an inequality evaluates to TRUE or FALSE, and that R treats these as 1 and 0 in arithmetic. So you can evaluate a, b, c, and d against their limits and sum the results of the 4 inequalities.

    library(dplyr)
    
    ex_dat_tally <- ex_dat %>%
      mutate(tally = (a > a_lim) + (b > b_lim) + (c > c_lim) + (d > d_lim))
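If there are many columns, writing out each comparison by hand becomes tedious. The sketch below is one possible generalization, not part of the original answer. It assumes every value column has a matching column named with the suffix `_lim`, and it uses a small made-up data frame (`toy`, `vars`, `lims`, and `exceeds` are hypothetical names) so that the result can be checked by hand. Using rowSums() with `na.rm = TRUE` mirrors the `na.rm = T` in the question, whereas adding the comparisons with `+` returns NA for any row that contains a missing value.

```r
# Hypothetical data with known values; scalars are recycled to every row.
toy <- data.frame(
  a = c(1.0, 0.5, NA),  a_lim = 0.75,
  b = c(0.4, 0.2, 0.9), b_lim = 0.333,
  c = c(1.2, 1.5, 0.8), c_lim = 1.0,
  d = c(2.0, 1.0, 1.3), d_lim = 1.25
)

vars <- c("a", "b", "c", "d")   # value columns
lims <- paste0(vars, "_lim")    # matching limit columns, in the same order

# Element-wise comparison of the two column blocks gives a logical matrix.
exceeds <- as.matrix(toy[vars]) > as.matrix(toy[lims])

# Count the TRUE entries per row, ignoring comparisons that are NA.
toy$tally <- rowSums(exceeds, na.rm = TRUE)

print(toy$tally)
```

For this toy data the tallies are 4, 1, and 2: every limit is exceeded in the first row, only `c` exceeds its limit in the second, and the third row counts `b` and `d` while skipping the missing `a`.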