
Why is my 'if' argument not interpretable as logical


I am working through some data and trying to do some conditional filtering. I want to write a statement that checks whether one variable is equal to a number (in this case, 1) and, if so, filters based on the value of another column. The result should be that all rows with AtBatPitchSequence == 1 also have PitchType == "FA".

  • Please note that rows with AtBatPitchSequence > 1 should not be filtered out, so row 4 should be kept after the filter

My data (firsttwopitches) looks like this:

  YearID GameID GamePitchSequen~ PAofInning AtBatPitchSeque~ Inning Balls Strikes PitchType
   <dbl> <chr>             <dbl>      <dbl>            <dbl>  <dbl> <dbl>   <dbl>     <chr>
1   2018 DFCBC~                1          1                1      1     0       0        FA
2   2018 DFCBC~                2          1                2      1     1       0        FA
3   2018 DFCBC~                4          2                1      1     0       0        FA
4   2018 DFCBC~                5          2                2      1     0       1        SI
5   2018 DFCBC~                8          3                1      1     0       0        FA
6   2018 DFCBC~                9          3                2      1     0       1        FA

To address this problem, I am trying to use an if statement:

library(tidyverse)

firsttwopitches %>%
  if (AtBatPitchSequence == 1) {
    filter(PitchType == "FA")
  }

However, this raises an error and a warning:

Error in if (.) AtBatPitchSequence == 1 else { : 
  argument is not interpretable as logical
In addition: Warning message:
In if (.) AtBatPitchSequence == 1 else { :
  the condition has length > 1 and only the first element will be used

I do not understand why my argument is not interpretable as logical. In my head, it should assess whether AtBatPitchSequence equals 1 or not, then move on to the next row. Also, what does the warning message mean? If this warning is dealt with by correcting my if statement, don't worry about it, but I am still new and am trying to get better at debugging my own work. I read through the question "Error in if/while (condition) : argument is not interpretable as logical" and others to try to find my error, but was unsuccessful.

Thank you very much


Solution

  • To keep only the rows where both conditions hold, we can pass both to filter (comma-separated conditions are combined with &)

    library(dplyr)
    firsttwopitches %>%   
       filter(AtBatPitchSequence == 1, PitchType == "FA")
    

    To also keep the rows where AtBatPitchSequence is not equal to 1, as the question asks, add another expression with |

    firsttwopitches %>% 
        filter((AtBatPitchSequence == 1 & PitchType == "FA") | AtBatPitchSequence != 1)
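
To see that this keeps row 4, here is a minimal sketch on a small stand-in data frame. The two columns are copied from the six rows shown in the question; the data frame name pitches and the seventh row (a first pitch of type "SL") are hypothetical, added only so that the filter has something to drop.

```r
library(dplyr)

# Two columns copied from the question's printout, plus one hypothetical
# seventh row (first pitch "SL") so that at least one row gets dropped.
pitches <- data.frame(
  AtBatPitchSequence = c(1, 2, 1, 2, 1, 2, 1),
  PitchType = c("FA", "FA", "FA", "SI", "FA", "FA", "SL"),
  stringsAsFactors = FALSE
)

kept <- pitches %>%
  filter((AtBatPitchSequence == 1 & PitchType == "FA") | AtBatPitchSequence != 1)

# The second-pitch "SI" row survives; only the first-pitch "SL" row is removed.
print(kept)
```

Every remaining first pitch is "FA", while later pitches are left alone regardless of type.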
    

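    There are two issues with the original code. 1) if/else is not vectorized: it expects a single TRUE or FALSE, not one value per row. 2) The pipe (%>%) inserts the data as the first argument of whatever follows, so here the data frame itself becomes the condition, which is what the "if (.)" in the error message shows, and a data frame is not interpretable as logical. Wrapping the code in {} prevents that insertion. A related issue is that column names such as AtBatPitchSequence cannot be found outside the tidyverse functions (mutate, summarise, etc.), so in that case we need to specify the data as well: .$AtBatPitchSequence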
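
The "not vectorized" point can be illustrated in base R. This is a rough sketch; the vector seq_in_at_bat is a hypothetical stand-in for the AtBatPitchSequence column, and the labels are made up for illustration.

```r
# Hypothetical vector standing in for the AtBatPitchSequence column.
seq_in_at_bat <- c(1, 2, 1, 2)

# Comparing a vector gives one logical value per element, not a single value.
is_first <- seq_in_at_bat == 1
print(is_first)

# `if` needs exactly one TRUE/FALSE, so collapse the vector with any() or all().
if (any(is_first)) {
  message("at least one first pitch")
}

# For element-wise branching, use a vectorized function such as ifelse().
label <- ifelse(is_first, "first pitch", "later pitch")
print(label)
```

filter works on the per-row logical vector directly, which is why it fits row-wise conditions better than if.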


    The error/warning can be reproduced with the inbuilt dataset

    data(iris)
    head(iris) %>% 
       if(Species == 'setosa') {
           filter(Petal.Length > 1.5)
        }
    

    Error in if (.) Species == "setosa" else { : 
      argument is not interpretable as logical
    In addition: Warning message:
    In if (.) Species == "setosa" else { :
      the condition has length > 1 and only the first element will be used

    Now, we can remove the error by wrapping the code in {}. The warning remains, though, because if/else is not vectorized: only the first element of the condition is checked, which can give incorrect output. (The output below is correct only because every row of head(iris) is setosa, so checking the first row gives the same answer as checking all of them.)

    head(iris) %>% 
        {if(.$Species == 'setosa') {
            filter(., Petal.Length > 1.5)
         }}
    #  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    #1          5.4         3.9          1.7         0.4  setosa
    

    Warning message: In if (.$Species == "setosa") { : the condition has length > 1 and only the first element will be used
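
If the condition really is about the whole dataset rather than individual rows, one option is to collapse it to a single value before handing it to if. A minimal sketch, assuming the intent is "filter only when every row is setosa" (that intent is an assumption for illustration, not something the question asks for). Note that in R 4.2.0 and later a condition of length greater than one is an error rather than a warning, so the version above would stop there.

```r
library(dplyr)

result <- head(iris) %>%
  {
    # all() reduces the per-row comparison to a single TRUE/FALSE for `if`.
    if (all(.$Species == "setosa")) {
      filter(., Petal.Length > 1.5)
    } else {
      .  # return the data unchanged when the condition is not met
    }
  }

print(result)
```

The explicit else branch matters: without it, the pipeline would return NULL whenever the condition is FALSE.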

    If we pass multiple expressions to filter, separated by commas, they are combined with &

    head(iris) %>% 
        filter(Species == 'setosa', Petal.Length > 1.5)
    #  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    #1          5.4         3.9          1.7         0.4  setosa
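
As a quick check of the comma-versus-& behaviour, and of the | form on data with more than one group, here is a small sketch on the full iris dataset. The variable names with_comma, with_and, and kept are just for illustration.

```r
library(dplyr)

# Comma-separated conditions and an explicit & select the same rows.
with_comma <- iris %>% filter(Species == "setosa", Petal.Length > 1.5)
with_and   <- iris %>% filter(Species == "setosa" & Petal.Length > 1.5)
print(identical(with_comma, with_and))

# Adding an | branch keeps every non-setosa row untouched while still
# restricting the setosa rows, mirroring the AtBatPitchSequence != 1 case.
kept <- iris %>%
  filter((Species == "setosa" & Petal.Length > 1.5) | Species != "setosa")

print(table(kept$Species))
```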