I am working through some data and trying to do some conditional filtering. I want to write a statement that checks whether one variable is equal to a number (in this case, 1) and, if so, filters based on the value of another column. The result should be that all rows with AtBatPitchSequence == 1 also have PitchType == "FA".
My data (firsttwopitches) looks like this:
  YearID GameID GamePitchSequen~ PAofInning AtBatPitchSeque~ Inning Balls Strikes PitchType
   <dbl> <chr>             <dbl>      <dbl>            <dbl>  <dbl> <dbl>   <dbl> <chr>
1   2018 DFCBC~                1          1                1      1     0       0 FA
2   2018 DFCBC~                2          1                2      1     1       0 FA
3   2018 DFCBC~                4          2                1      1     0       0 FA
4   2018 DFCBC~                5          2                2      1     0       1 SI
5   2018 DFCBC~                8          3                1      1     0       0 FA
6   2018 DFCBC~                9          3                2      1     0       1 FA
To address this problem, I am trying to use an if statement:
library(tidyverse)
firsttwopitches %>%
if (AtBatPitchSequence == 1) {
filter(PitchType == "FA")
}
However, this raises an error and a warning:
Error in if (.) AtBatPitchSequence == 1 else { :
argument is not interpretable as logical
In addition: Warning message:
In if (.) AtBatPitchSequence == 1 else { :
the condition has length > 1 and only the first element will be used
I do not understand why my argument is not interpretable as logical. In my head, it should assess whether AtBatPitchSequence equals 1 or not, then move on to the next row. Also, what does the warning message mean? If this warning is dealt with by correcting my if statement, don't worry about it, but I am still new and am trying to get better at debugging my own work. I read through the question "Error in if/while (condition) : argument is not interpretable as logical" and others to try to find my error, but was unsuccessful.
Thank you very much
We can use an & condition in filter:
library(dplyr)
firsttwopitches %>%
filter(AtBatPitchSequence == 1, PitchType == "FA")
If we also want to keep the rows where AtBatPitchSequence is not equal to 1, add another expression with |:
firsttwopitches %>%
filter((AtBatPitchSequence == 1 & PitchType == "FA")|AtBatPitchSequence != 1)
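To see how the two calls differ, here is a minimal sketch that rebuilds two columns of the six rows printed in the question. The name `pitches` and the seventh row are made up for illustration, so that at least one first pitch is not "FA". The shorter form at the end assumes neither column contains missing values.

```r
library(dplyr)

# Hypothetical stand-in for firsttwopitches: two columns of the six rows
# shown in the question, plus one invented row (the last) for illustration.
pitches <- tibble(
  AtBatPitchSequence = c(1, 2, 1, 2, 1, 2, 1),
  PitchType          = c("FA", "FA", "FA", "SI", "FA", "FA", "SI")
)

# Keeps only first pitches that are "FA"; every other row is dropped
first_fa <- pitches %>%
  filter(AtBatPitchSequence == 1, PitchType == "FA")

# Keeps all non-first pitches, plus first pitches that are "FA"
conditional <- pitches %>%
  filter((AtBatPitchSequence == 1 & PitchType == "FA") | AtBatPitchSequence != 1)

# Shorter form of the same condition (equivalent when there are no NAs)
conditional_short <- pitches %>%
  filter(AtBatPitchSequence != 1 | PitchType == "FA")

nrow(first_fa)      # 3
nrow(conditional)   # 6
```

Only the invented seventh row is removed by the second call, which is the "filter conditionally" behaviour described in the question.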
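There are two issues: 1) if/else is not vectorized, and 2) the code is not wrapped in a {} block, which matters when it is used in a pipe (%>%). Without the braces, the pipe inserts the data as the first argument of if, so the data frame itself becomes the condition. That is why the error message shows if (.) AtBatPitchSequence == 1 else {. A related issue is that the column name AtBatPitchSequence cannot be found outside tidyverse functions such as mutate, summarise, etc. In that case we need to specify the data as well, i.e. .$AtBatPitchSequence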
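The "not vectorized" point can be sketched in base R without any packages. The vectors below are made-up stand-ins for two columns of the data, not values taken from the question.

```r
# Hypothetical vectors mirroring two columns from the question
seq_no <- c(1, 2, 1, 2)
pitch  <- c("FA", "FA", "SI", "SI")

# A comparison on a vector returns one TRUE/FALSE per element
is_first <- seq_no == 1
length(is_first)   # 4

# if() wants a single TRUE/FALSE, so the vector must be collapsed first,
# for example with any() or all()
if (any(is_first)) message("at least one first pitch")

# Row-wise logic uses the vectorized operators & and | instead of if()
keep <- !is_first | pitch == "FA"
keep               # TRUE TRUE FALSE TRUE
```

filter works on exactly this kind of per-row logical vector, which is why the condition belongs inside filter rather than inside if.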
The error/warning can be reproduced with the inbuilt dataset
data(iris)
head(iris) %>%
if(Species == 'setosa') {
filter(Petal.Length > 1.5)
}
Error in if (.) Species == "setosa" else { :
argument is not interpretable as logical
In addition: Warning message:
In if (.) Species == "setosa" else { :
the condition has length > 1 and only the first element will be used
Now, we can remove the error by wrapping the code in {}, but note that the warning remains because if/else is not vectorized, and this could give incorrect output as well. (The output below is correct, but only because the first element of the condition is TRUE and every row of head(iris) is setosa. In R 4.2.0 and later, a condition of length > 1 is an error rather than a warning.)
head(iris) %>%
{if(.$Species == 'setosa') {
filter(., Petal.Length > 1.5)
}}
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1 5.4 3.9 1.7 0.4 setosa
Warning message:
In if (.$Species == "setosa") { :
the condition has length > 1 and only the first element will be used
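If the condition really is about the table as a whole rather than about each row, one possible sketch is to collapse it to a single TRUE/FALSE first and return the data unchanged in the else branch, so the pipe still passes something on. Using all() here is an assumption about intent, not something the question asks for.

```r
library(dplyr)

# Collapse the per-row comparison to one TRUE/FALSE with all(), so that
# if() receives a length-one condition.
result <- head(iris) %>%
  {
    if (all(.$Species == "setosa")) {
      filter(., Petal.Length > 1.5)
    } else {
      .   # pass the data through untouched when the condition is FALSE
    }
  }

nrow(result)   # 1
```

This is a whole-table decision. For the row-by-row behaviour the question describes, the condition belongs inside filter.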
If we pass multiple expressions to filter, the , between them combines them with &:
head(iris) %>%
filter(Species == 'setosa', Petal.Length > 1.5)
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1 5.4 3.9 1.7 0.4 setosa