I've stumbled across the following namespace problem with R's subset function so many times that I would like to ask for a more elegant solution than mine:
Species <- 'setosa'
subset(iris, Species==Species)
returns the entire iris dataset, I think because Species==Species evaluates to TRUE for every row.
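A minimal sketch of why this happens, assuming the built-in iris data: inside subset(), the name Species on both sides of == is looked up in the data frame first, so the column is compared with itself. The variable name target is only a hypothetical illustration of a name that does not clash.

```r
Species <- "setosa"

# Both occurrences of `Species` resolve to the column iris$Species,
# so the condition is TRUE for every row and nothing is filtered out.
same_name <- subset(iris, Species == Species)
all_rows_kept <- nrow(same_name) == nrow(iris)

# `target` is not a column of iris, so it is found in the calling environment.
target <- "setosa"
only_setosa <- subset(iris, Species == target)
```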
My solution would be
subset(iris, Species==get('Species', envir = .GlobalEnv))
but this would not work when the variable Species is only defined within the scope of a function.
It would of course also be possible to use a different variable name like species (lowercase) for the global variable.
However, I think this would be less readable, and as an end user I would expect R to allow this kind of comparison of two variables with the same name from different namespaces.
The base R subset function simply does not handle the case of identical variable names well. I think that's one of the reasons the subset help page contains the following warning:
This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [, and in particular the non-standard evaluation of argument subset can have unanticipated consequences.
That message is basically suggesting you use
iris[iris$Species==Species, ]
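Because [ uses standard evaluation, the same pattern should also work when Species exists only inside a function, which is the case the question worries about. A small sketch, where filter_species is a hypothetical helper name and not part of base R:

```r
# Hypothetical helper: `Species` is a local argument here, not a global variable.
filter_species <- function(data, Species) {
  # data$Species is the column; the bare `Species` is the function argument.
  data[data$Species == Species, ]
}

versicolor_rows <- filter_species(iris, "versicolor")
```

One caveat: unlike subset(), [ keeps rows where the condition is NA, so wrapping the condition in which() may be preferable if the column can contain missing values.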
An alternative to get() would be to use the globalenv() function to get the global environment:
subset(iris, Species==globalenv()$Species)
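Neither get(..., envir = .GlobalEnv) nor globalenv() looks inside a function's frame, so both miss a variable that is defined only there. One possible workaround, shown here only as a sketch, is to capture the function's own environment under a name that is not a column of the data. The names filter_local and caller_env are hypothetical.

```r
# Hypothetical helper: look up the local `Species` through an explicit environment.
filter_local <- function(data, Species) {
  force(Species)               # make sure the argument is evaluated
  caller_env <- environment()  # the frame of this function call
  # `Species` on the left is the column; `caller_env` is not a column,
  # so subset() finds it in the calling frame.
  subset(data, Species == caller_env$Species)
}

virginica_rows <- filter_local(iris, "virginica")
```

This relies on the data not having a column called caller_env, so it moves the name clash rather than removing it.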
If you are using dplyr's filter() function, there is a way to be explicit with the .env pronoun:
dplyr::filter(iris, Species==.env$Species)