r, filter, tidyverse, data-quality

How to select all rows that have the same value in each column using tidyverse?


I'm working on data quality analysis for a questionnaire where respondents were asked to check off every food item that they ate. Some respondents left the form blank, so I'm trying to figure out a way to select or count all rows where every column is blank.

In this particular dataset, a value of "No" indicates that the box was not checked on the form, and a value of "Yes" means that the box was checked. The columns are currently character variables. It's easy to count the number of "Yes" and "No" responses in a particular column, but I'm interested in counting rows where every single response is "No".

To make things simple, let's say that there are 5 columns representing different foods on the chart: Apple, Banana, Sandwich, Yogurt, Strawberry (the actual dataset has over 40 different food items). How would I select every row where "No" is the response for each food item?

This is what I've tried so far and it isn't working:

food_history <- food %>%
     filter(Apple:Strawberry=="No")

Solution

  • We can use if_all(), which keeps a row only when the condition is TRUE for every selected column:

    library(dplyr)
    food %>%
       filter(if_all(Apple:Strawberry, ~ .x == "No"))
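
To check the behaviour on a small case, here is a minimal sketch using made-up data. The `food` data frame, its `id` column, and the names `blank_forms` and `n_blank` are assumptions for illustration, not the asker's real dataset. It assumes a dplyr version that provides `if_all()` (1.0.4 or later, as far as I know).

```r
library(dplyr)

# Hypothetical toy data standing in for the real `food` data frame
food <- data.frame(
  id         = 1:4,
  Apple      = c("No", "Yes", "No", "No"),
  Banana     = c("No", "No",  "No", "Yes"),
  Sandwich   = c("No", "No",  "No", "No"),
  Yogurt     = c("No", "Yes", "No", "No"),
  Strawberry = c("No", "No",  "No", "No"),
  stringsAsFactors = FALSE
)

# Keep only the rows where every food column equals "No" (blank forms)
blank_forms <- food %>%
  filter(if_all(Apple:Strawberry, ~ .x == "No"))

# Count the blank forms
n_blank <- nrow(blank_forms)
```

With this toy data, only the respondents with `id` 1 and 3 have "No" in every food column, so `n_blank` is 2. One caveat: if any of the selected columns contains `NA`, the comparison yields `NA` for that row and `filter()` drops it, so missing values may need separate handling.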