Given the following dataframe:
df <- data.frame(a=c(NA,1,2), b=c(3,4,5))
I can pass a column name as a string to select:
> df %>% select("a")
   a
1 NA
2  1
3  2
Or I can use symbolic names with select. That's fine too:
> df %>% select(a)
   a
1 NA
2  1
3  2
pull accepts both as well:
> df %>% pull("a")
[1] NA  1  2
> df %>% pull(a)
[1] NA  1  2
But I cannot use strings with filter:
> df %>% filter("a"==1)
[1] a b
<0 rows> (or 0-length row.names)
Only symbolic names work:
> df %>% filter(a==1)
  a b
1 1 4
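Why does it work with select but not with filter? Shouldn't it be consistent?
"a" is an argument to select and pull, but it is not an argument to filter, so the situations are not the same. In filter("a" == 1) the argument is the whole expression "a" == 1, which compares the string "a" with the number 1 and is FALSE, so no rows are returned. Also, the code shown here, which returns the rows for which column a equals the letter "a", would no longer work if filter were allowed to interpret "a" as column a.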
data.frame(a = letters) %>% filter( a == "a" )
##   a
## 1 a
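To see why the question's filter("a" == 1) call returns zero rows, it may help to evaluate the condition on its own. This is a minimal sketch using the df from the question; nothing in it looks up a column.

```r
library(dplyr)

df <- data.frame(a = c(NA, 1, 2), b = c(3, 4, 5))

# The string "a" is compared with the number 1; no column is involved.
cond <- "a" == 1
cond
## [1] FALSE

# The filter condition is therefore a constant FALSE, which keeps no rows.
n_kept <- nrow(filter(df, "a" == 1))
n_kept
## [1] 0
```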
1) dplyr provides if_any and if_all, which accept a column name given as a string:
library(dplyr)
df %>%
  filter(if_any("a") == 1)
##   a b
## 1 1 4
2) Although filter_at has been superseded by the syntax in (1), superseded is not the same as deprecated: it will continue to be available, so it is fine to use, though it is not preferred by the dplyr developers.
df %>%
  filter_at("a", all_vars(. == 1))
##   a b
## 1 1 4
Also note that the following used to work, and still does with a warning, but it has been deprecated and will stop working in the future, so do not use it:
# deprecated - do not use
df %>%
  filter(across("a", ~ . == 1))
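A further option, not among those listed above, is dplyr's .data pronoun, which looks a column up by a string inside an ordinary filter expression. This is a sketch under the assumption that the name is held in a variable; col is a hypothetical name introduced only for illustration.

```r
library(dplyr)

df <- data.frame(a = c(NA, 1, 2), b = c(3, 4, 5))

col <- "a"  # hypothetical variable holding the column name as a string

# .data[[col]] refers to the column named by the string in col.
res <- df %>%
  filter(.data[[col]] == 1)
res
##   a b
## 1 1 4
```

The row with NA in a is dropped because filter keeps only rows where the condition is TRUE.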
Input from question
df <- data.frame(a = c(NA, 1, 2), b = c(3, 4, 5))