r dataframe tidyverse summarize

Count nonzero values in each column with the tidyverse


I have a df with a bunch of sites and a bunch of variables. I need to count the number of non-zero values for each site. I feel like I should be able to do this with summarize() and count() or tally(), but can't quite figure it out.

reprex:


library(dplyr)  # provides %>% and re-exports tribble()

df <- 
  tribble(
    ~variable,   ~site1,   ~site2,  ~site3,
    "var1",        0,        1,        0,
    "var2",        .5,       0,        0,
    "var3",        .1,       2,        0,
    "var4",        0,        .8,       1
  )


# does not work: count() expects a data frame, not a logical vector
df %>%
  summarise(across(where(is.numeric), ~ count(.x>0)))

Desired output:

# A tibble: 1 × 3
  site1 site2 site3
  <dbl> <dbl> <dbl>
1   2     3     1

Solution

  • A possible solution. The comparison .x != 0 gives a logical vector, and sum() counts its TRUE values:

    library(dplyr)
    
    df %>% 
      summarise(across(starts_with("site"), ~ sum(.x != 0)))
    
    #> # A tibble: 1 × 3
    #>   site1 site2 site3
    #>   <int> <int> <int>
    #> 1     2     3     1
    

    Another possible solution, in base R (the \(x) lambda shorthand needs R 4.1 or later):

    apply(df[-1], 2, \(x) sum(x != 0))
    
    #> site1 site2 site3 
    #>     2     3     1
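
If the real data contains missing values, sum(.x != 0) returns NA for any column that has one. The sketch below is one way to handle that, under the assumption that NA entries should simply be left out of the count. The df_na tibble is a hypothetical variant of the question's data (one zero in site2 replaced by NA), not something from the original post. It also shows colSums() as a shorter base R alternative to apply().

```r
library(dplyr)

# Hypothetical variant of the question's data: one zero in site2 is now NA
df_na <- tribble(
  ~variable, ~site1, ~site2, ~site3,
  "var1",    0,      1,      0,
  "var2",    0.5,    NA,     0,
  "var3",    0.1,    2,      0,
  "var4",    0,      0.8,    1
)

# Assumption: NA should be ignored, so na.rm = TRUE drops the NA comparisons
counts <- df_na %>%
  summarise(across(where(is.numeric), ~ sum(.x != 0, na.rm = TRUE)))

# Base R: df_na[-1] != 0 is a logical matrix, and colSums() counts TRUE per column
counts_base <- colSums(df_na[-1] != 0, na.rm = TRUE)
```

where(is.numeric) picks up every numeric column, which is convenient when the site columns do not share a common prefix. starts_with("site") is the safer choice if the data also has numeric columns that are not sites.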