Given the following data ...
x <- data.frame("Y" = 2000:2010,
                "A" = c(0, NA, 1, 1, 0, 1, NA, NA, 1, 0, NA),
                "B" = c(0, 0, NA, 1, 1, 0, NA, NA, 0, 1, NA))
... I was able to remove all rows in which every one of a set of specific columns is NA, following this wonderful answer.
x |> dplyr::filter(if_any(c("A", "B"), ~ !is.na(.x)))
#>      Y  A  B
#> 1 2000  0  0
#> 2 2001 NA  0
#> 3 2002  1 NA
#> 4 2003  1  1
#> 5 2004  0  1
#> 6 2005  1  0
#> 7 2008  1  0
#> 8 2009  0  1
As I don't have much experience with dplyr yet, I can't figure out how this expression needs to be modified so that filter() is applied only to specific rows, e.g. the last one.
The expected result would look like this, dropping Y = 2010 and keeping Y = 2006 and Y = 2007:
#>       Y  A  B
#> 1  2000  0  0
#> 2  2001 NA  0
#> 3  2002  1 NA
#> 4  2003  1  1
#> 5  2004  0  1
#> 6  2005  1  0
#> 7  2006 NA NA
#> 8  2007 NA NA
#> 9  2008  1  0
#> 10 2009  0  1
In the specific case of focussing on the last row, one can use row_number() in combination with n(). See ?cur_group for more on n() and ?row_number for more on row_number().
library(dplyr)
x %>%
  # drop the last row only if both A and B are NA
  filter(!(row_number() == n() & if_all(A:B, is.na)))
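The same idea should extend from the last row to any set of rows. Below is a minimal sketch, assuming the rows are identified by position and stored in a vector called rows_to_check (a hypothetical name, not part of the question). It uses if_all(c(A, B), is.na), so a checked row is dropped only when both A and B are NA. That mirrors the if_any(..., ~ !is.na(.x)) condition from the question.

```r
library(dplyr)

x <- data.frame("Y" = 2000:2010,
                "A" = c(0, NA, 1, 1, 0, 1, NA, NA, 1, 0, NA),
                "B" = c(0, 0, NA, 1, 1, 0, NA, NA, 0, 1, NA))

# Hypothetical choice of row positions to check: the 7th row and the last row.
rows_to_check <- c(7, nrow(x))

res <- x %>%
  # Drop a row only if it is one of the checked rows AND both A and B are NA;
  # all other rows are kept regardless of their NA values.
  filter(!(row_number() %in% rows_to_check & if_all(c(A, B), is.na)))

res
```

With this choice of rows_to_check, Y = 2006 and Y = 2010 are dropped, while Y = 2007 is kept because its row is not among the checked positions. Setting rows_to_check <- nrow(x) reduces this to the last-row case.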