Warning message in R package 'Formula'


When using Formula() or as.Formula() from the Formula package, I get a warning message. It does not appear to affect functionality, but I have not been able to understand the source of it.

I am using the Formula package to update multipart formulas (for ivreg() in the AER package, but that is unrelated to the problem). After using Formula() or as.Formula() on a formula object, the next line of code that I run produces a warning message. I have been through the documentation and GitHub repo but cannot understand the source of it.

> library(Formula)
> f1 <- y ~ x1 + x2 | z1 + z2 + z3
> F1 <- Formula(f1)
> class(F1)
[1] "Formula" "formula"
Warning message:
In is.name(callee) && length(object) > 20 :
  'length(x) = 2 > 1' in coercion to 'logical(1)'

To be clear, it is not class(F1) specifically that produces this warning. For example:

> F1 <- Formula(f1)
> print("lol")
[1] "lol"
Warning message:
In is.name(callee) && length(object) > 20 :
  'length(x) = 2 > 1' in coercion to 'logical(1)'

I have emailed the package author.

The problem appears unrelated to other packages:

> (.packages())
[1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"   "base"   

However, I cannot reproduce the warning in RGui, so it may be related to RStudio.


Solution

  • The warning is triggered in the following way: When an object is created in RStudio, the .rs.describeObject() function from tools:rstudio is triggered to obtain information about the object. Among other things this uses the .rs.sanitizeCall() function which contains the following line:

    long <- is.name(callee) && length(object) > 20
    

    This assumes that length() returns a single number, because && expects a length-one logical value on each side. Unfortunately, that is not the case for Formula() objects:

    f <- Formula(y ~ x | z)
    length(f)
    ## [1] 1 2
    

    So there isn't anything that the Formula package could do to avoid the warning - other than breaking the behavior of its length() method which has been in place for almost 1.5 decades.

    In hindsight it was probably not the best decision to make the length() method behave this way. This even made it to the official ?length documentation in base R:

    Warning:

    Package authors have written methods that return a result of length other than one ('Formula') and that return a vector of type 'double' ('Matrix'), even with non-integer values (earlier versions of 'sets'). Where a single double value is returned that can be represented as an integer it is returned as a length-one integer vector.

    One way to avoid this issue would be to use

    length(object)[1L] > 20
    

    or

    any(length(object) > 20)
    

    in .rs.sanitizeCall(). I would recommend reporting this to the RStudio developers.
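
The mechanism can be reproduced without RStudio or the Formula package. The following is only a sketch under stated assumptions: the class name "twopart" and the variables obj and callee are hypothetical stand-ins, and the toy length() method merely imitates the two-element result shown above for Formula objects. Depending on the R version, the original condition may pass silently, raise a warning, or fail with an error, so the outcome is captured rather than assumed.

```r
# Toy S3 class (hypothetical, not part of the Formula package) whose length()
# method returns two values, imitating length() on a Formula object.
length.twopart <- function(x) c(1L, 2L)
obj <- structure(list(), class = "twopart")

# Stand-in for the symbol that is.name() checks in the original line.
callee <- as.name("f")

# Original pattern: the right-hand side of && is a logical vector of length 2.
# The reaction depends on the R version, so it is captured with tryCatch().
original <- tryCatch(
  is.name(callee) && length(obj) > 20,
  warning = function(w) "warning",
  error   = function(e) "error"
)

# Suggested alternatives: both reduce the comparison to a single logical value.
fix_first <- is.name(callee) && length(obj)[1L] > 20
fix_any   <- is.name(callee) && any(length(obj) > 20)

print(original)
print(fix_first)  # FALSE, because 1 > 20 is FALSE
print(fix_any)    # FALSE, because neither 1 nor 2 exceeds 20
```

The two alternatives are not equivalent in general: the first looks only at the first element of the length, while the second is TRUE if any element exceeds the threshold.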