Escaping Custom Operators in R


One may define a new binary operator...

`%my_op%` <- function(lhs, rhs) {
  # ...
}

...and use it like so:

4 %my_op% 5

Experimentation shows that nearly any sequence of characters is valid between the opening and closing % "bookends". The definition need only escape (\) the backtick (`) used for complex names, along with the escape character (\) itself:

`%space op%` <- sum
`%tick\`op%` <- sum
`%slash\\op%` <- sum

When the operators are actually used, the escape sequence is unnecessary, thanks (apparently) to the syntactic significance of the % bookends.

4 %space op% 5
#> [1] 9

4 %tick`op% 5
#> [1] 9

4 %slash\op% 5
#> [1] 9
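
One way to check the "bookends" intuition is to ask the parser how it tokenizes these calls. The following is only a sketch: `show_tokens` is a hypothetical helper name (not part of base R) built on `parse()` and `utils::getParseData()`, and it lists the terminal tokens for a piece of source text.

```r
# Return the terminal tokens the parser produces for a piece of R source.
# `show_tokens` is a hypothetical helper name, not part of base R.
show_tokens <- function(code) {
  # keep.source = TRUE is needed so that parse data is recorded
  parsed <- parse(text = code, keep.source = TRUE)
  tokens <- utils::getParseData(parsed)
  tokens[tokens$terminal, c("token", "text")]
}

# The backslash is doubled here only because the code sits in a string literal.
tick_tokens  <- show_tokens("4 %tick`op% 5")
slash_tokens <- show_tokens("4 %slash\\op% 5")

print(tick_tokens)
print(slash_tokens)
```

If the intuition is right, everything between the two % characters, backtick or backslash included, should come back as a single token of type SPECIAL (the same token name that appears in R's "unexpected SPECIAL" error message).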

Problem

Is it possible to include the % character itself in the operator name? I have tried several approaches to define such an operator...

# No escape.
`%test%op%` <- sum

# Conventional escape (by backslash).
`%test\%op%` <- sum
#> Error: '\%' is an unrecognized escape in character string starting "`%test\%"

# Conventional escape (by escaped backslash).
`%test\\%op%` <- sum

# Reflexive escape (by doubling).
`%test%%op%` <- sum

...but they are syntactically unviable:

4 %test%op% 5
#> Error: unexpected input in "4 %test%op% 5"

4 %test\%op% 5
#> Error: unexpected input in "4 %test\%op% 5"

4 %test%%op% 5
#> Error: unexpected SPECIAL in "4 %test%%op%"

Questions

  1. Is there any way (i.e. an escape sequence) to define an operator with % in its name? If so, how would one call this operator when using it?
  2. What complex names (`...\`...\\...`) are valid for R variables but not for custom operators? And vice versa?

Solution

  • There is no problem in defining a binary function with an extra % in it:

    `%my % func%` <- function(a, b) paste0(round(100 * a / b, 3), '%')
    

    But we cannot call it using the special binary operator syntax:

    5 %my % func% 10
    #> Error: unexpected input in "5 %my % func% 10"
    

    We would have to call it like any other normal function:

    `%my % func%`(5, 10)
    #> [1] "50%"
    

    The reason is that when the R parser tokenizes input, it enters a special state on reaching a % that isn't enclosed in backticks. It then reads the characters that follow literally until it reaches the very next %, and treats everything in between as the operator name. Backticks, quotes, backslashes, etc. make no difference to how that second % is interpreted by the parser.

    In fact, we can even have a binary function with all these symbols in it (it's easier to define this using assign):

    assign('% `"@\\ %', function(a, b) paste0(round(100 * a / b, 3), '%'))
    

    Which allows the rather bizarre but legal R code:

    5 % `"@\ % 10
    #> [1] "50%"
    

    There are ways to have something that looks like a binary operator with a % in it if you get creative with non-standard evaluation:

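    `%_%` <- function(a, b) {
      lhs <- substitute(a)  # unevaluated left operand, e.g. 1 %_% percent
      # Only act when the left operand is itself a call to %_%;
      # its second operand (the middle symbol) is never evaluated.
      if (is.call(lhs) && identical(lhs[[1]], as.name("%_%")))
        paste0(round(100 * eval(lhs[[2]], parent.frame()) / b, 3), "%")
    }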
    

    Which allows:

    1 %_%percent%_% 2
    #> [1] "50%"
    

    But this is really just two binary operators strung together, ignoring the symbol in between. You can't call a single special binary operator with infix syntax if its name contains an extra %.
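
As for the second question, rather than guessing at the rules, one can ask the parser whether a given name is usable in infix position. This is a minimal sketch, not a definitive test: `is_infix_callable` is a hypothetical helper (not part of base R) that pastes the name between two operands and checks whether the result parses as a call to that name.

```r
# `is_infix_callable` is a hypothetical helper, not part of base R.
# It asks the parser directly: does "1 <name> 2" parse as a call to <name>?
is_infix_callable <- function(name) {
  parsed <- tryCatch(
    parse(text = paste("1", name, "2"))[[1]],
    error = function(e) NULL  # a syntax error means the infix form is unusable
  )
  is.call(parsed) && identical(parsed[[1]], as.name(name))
}

# Names taken from the question and the answer above. The doubled backslash
# is only string-literal escaping; the actual name contains one backslash.
candidates <- c("%space op%", "%tick`op%", "%slash\\op%",
                "%my % func%", "%test%op%", "%test%%op%")
results <- vapply(candidates, is_infix_callable, logical(1))
print(results)
```

The first three names should come back TRUE and the last three FALSE, which matches the behaviour described above: all six are valid as backtick-quoted variable names, but only those without an interior % can be called as operators.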