Get number of elements after filtering in dplyr using length()


Using the mtcars dataset, I want to count how many cars have more than 6 cylinders (cyl).

I use length() after filtering and get a result of 11:

library(dplyr)
mtcars %>%
  filter(
    cyl > 6
  ) %>%
  length()

However, the code provided by the tutorial looks like this and returns 14:

library(dplyr)
mtcars %>%
  filter(cyl > 6) %>%
  summarise(n())

Viewing the result directly after filtering also shows 14 rows.

I have since learned that summarise(n()) is the better approach (see "Count number of rows by group using dplyr"), and that there are other, better ways to count after filtering, but I am still confused about why my code returns a different result and where the 11 comes from.

Thanks


Solution

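  • When applied to a data frame, length() returns the number of columns, not rows, because a data frame is a list of columns. Your code therefore counted the 11 columns of mtcars instead of the 14 filtered rows. To get a row count from length(), first pull() a single column out as a vector; with no arguments, pull() extracts the last column.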

    library(dplyr)

    # Counts columns
    mtcars %>%
      filter(cyl > 6) %>%
      length()
    #> [1] 11

    # Counts rows
    mtcars %>%
      filter(cyl > 6) %>%
      summarise(n())
    #>   n()
    #> 1  14

    # Counts rows
    mtcars %>%
      filter(cyl > 6) %>%
      pull() %>%
      length()
    #> [1] 14
    

    Created on 2024-05-12 with reprex v2.1.0

    Or you could have just used count() instead of length().
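
As a rough sketch of the count() suggestion, the snippet below compares the relevant functions on the same filtered data. The variable names (big_engines, n_cols, n_rows, n_counted) are only illustrative and do not appear in the question. It assumes dplyr is installed and uses the built-in mtcars data.

```r
library(dplyr)

# Filter once and keep the result so each function sees the same data
big_engines <- mtcars %>%
  filter(cyl > 6)

# A data frame is a list of columns, so length() agrees with ncol()
n_cols <- length(big_engines)
n_cols == ncol(big_engines)
#> [1] TRUE

# nrow() is the base R way to count the filtered rows
n_rows <- nrow(big_engines)
n_rows
#> [1] 14

# count() returns a one-row data frame whose column n holds the row count
n_counted <- big_engines %>%
  count()
n_counted
#>    n
#> 1 14
```

If a plain number is needed rather than a data frame, nrow() is probably the simplest choice. count() is more convenient when the count should stay inside a pipeline or be broken down by group, for example count(cyl).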