
Explanation of tilde in function arguments


I am trying to understand how to plot a cumulative line animation with plotly. The code from the "Plotly R Open Source Graphing Library" documentation is below:

library(plotly)

accumulate_by <- function(dat, var) {
  var <- lazyeval::f_eval(var, dat)
  lvls <- plotly:::getLevels(var)
  dats <- lapply(seq_along(lvls), function(x) {
    cbind(dat[var %in% lvls[seq(1, x)], ], frame = lvls[[x]])
  })
  dplyr::bind_rows(dats)
}

df <- txhousing 
fig <- df %>%
  filter(year > 2005, city %in% c("Abilene", "Bay Area"))
fig <- fig %>% accumulate_by(~date)

  1. The main question: what happens when we pass ~date to the accumulate_by function? Which values do the dat and var arguments receive, and how does it work?
  2. If I understood which values var and dat take, it would become clear what the f_eval function does, but at the moment I don't understand this.
  3. What is plotly:::getLevels? I could not find any documentation for this function.

Solution

    1. The main question: what happens when we pass ~date to the accumulate_by function? Which values do the dat and var arguments receive, and how does it work?

      accumulate_by takes whatever values of dat and var are passed to it, in that order. In other words, calling accumulate_by(Var1, Var2) is the same as calling accumulate_by(dat = Var1, var = Var2). This is called positional matching.

      As the code is written, fig is your dat argument, since the pipe (%>%) passes it into accumulate_by in the first position. ~date is your var argument, because it is in the second position.
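
Positional matching can be checked with a small sketch. The function show_args below is hypothetical and only reports what each argument received; it is not part of plotly.

```r
# Hypothetical helper: returns its arguments so we can see what landed where.
show_args <- function(dat, var) {
  list(dat = dat, var = var)
}

# Matched by position: first value goes to dat, second to var.
by_position <- show_args("first", "second")

# Matched by name: order no longer matters.
by_name <- show_args(var = "second", dat = "first")

identical(by_position, by_name)  # TRUE

# With the magrittr pipe, x %>% f(y) is evaluated as f(x, y),
# so fig %>% accumulate_by(~date) is accumulate_by(fig, ~date).
```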

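      The ~ in front of date creates a one-sided formula. A formula stores the expression date without evaluating it, so no object called date has to exist at the point of the call. Inside accumulate_by, lazyeval::f_eval(var, dat) then evaluates that expression using dat as the context, so var is overwritten with the date column of dat. This is why you can write ~date rather than fig$date.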
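
To see what a one-sided formula holds, here is a minimal base-R sketch. The toy data frame is made up for illustration, and eval() is used as a stand-in for lazyeval::f_eval under the assumption that both simply look up date inside the data frame in this simple case.

```r
# Toy data standing in for the filtered txhousing data (values are made up).
dat <- data.frame(date = c(2006.0, 2006.5, 2007.0), sales = c(10, 20, 30))

# The tilde captures the expression instead of evaluating it,
# so this works even though no object called `date` exists here.
f <- ~date
class(f)   # "formula"
f[[2]]     # the unevaluated symbol `date`

# Base-R stand-in for lazyeval::f_eval(f, dat): evaluate the captured
# expression with the data frame's columns in scope.
var <- eval(f[[2]], dat, environment(f))

identical(var, dat$date)  # TRUE
```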

      What accumulate_by actually does is build one block of rows per distinct date value. For the first date value it collects all rows (with all their columns) that have that date and tags them with that value in a new column called frame. For the next date value it collects the rows for that date plus the rows for all preceding dates, tags that block with the new frame value, and so on. The blocks are then stacked with dplyr::bind_rows. You can check str(fig) before and after applying accumulate_by to see that fig gets much longer (many more rows) and gains a column called frame.
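
The accumulation can be traced on a tiny data set. The sketch below is a simplified base-R variant, not the original function: it takes the column name as a string, uses sort(unique()) as an assumed stand-in for plotly:::getLevels, and uses rbind instead of dplyr::bind_rows. The data are made up.

```r
# Simplified variant of accumulate_by using only base R (assumptions above).
accumulate_by_sketch <- function(dat, var_name) {
  var <- dat[[var_name]]
  lvls <- sort(unique(var))  # assumed stand-in for plotly:::getLevels
  dats <- lapply(seq_along(lvls), function(x) {
    # All rows up to and including the x-th level, tagged with that level.
    cbind(dat[var %in% lvls[seq_len(x)], ], frame = lvls[[x]])
  })
  do.call(rbind, dats)
}

toy <- data.frame(
  date  = c(1, 2, 3, 1, 2, 3),
  city  = rep(c("A", "B"), each = 3),
  value = 1:6
)

out <- accumulate_by_sketch(toy, "date")

nrow(toy)         # 6
nrow(out)         # 12: 2 rows for frame 1, 4 for frame 2, 6 for frame 3
table(out$frame)
```

Each frame contains everything the previous frame had plus the rows for one more date, which is what makes the animated lines appear to grow.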

    2. If I understood which values var and dat take, it would become clear what the f_eval function does, but at the moment I don't understand this.

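      In short, lazyeval::f_eval(var, dat) evaluates the formula ~date inside dat and returns the date column as a plain vector. See also the answer to question 1.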

    3. What is plotly:::getLevels? I could not find any documentation for this function.

      The triple colon means "search entire package (in this case the package is plotly) including non-exported items in the package". What this means in practice is that lots of packages have "helper" functions and other utilities that aren't usually used by end-users. Instead those helper functions are used by other functions in the package. Helper functions aren't "exported", that is, generally available by just entering their names. One can access them though, by using the ::: triple colon.

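      Because getLevels is not really intended to be used by plotly package users, it likely isn't documented. In accumulate_by it supplies the distinct values of var that become the frames. You can still read its source by typing plotly:::getLevels (without parentheses) at the console.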
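
The difference between :: and ::: can be tried without plotly installed. The sketch below uses plot.lm from the stats package, chosen here only as a familiar example of a function that lives in a namespace without being exported.

```r
# plot.lm is defined inside the stats namespace ...
exists("plot.lm", envir = asNamespace("stats"), inherits = FALSE)  # TRUE

# ... but it is not in the list of exported names,
# so stats::plot.lm would fail.
"plot.lm" %in% getNamespaceExports("stats")  # FALSE

# The triple colon reaches it anyway.
internal_fun <- stats:::plot.lm
is.function(internal_fun)  # TRUE

# Printing a function object (no parentheses) shows its source;
# here we only peek at its argument names.
head(names(formals(internal_fun)))
```

The same steps apply to plotly:::getLevels. Since non-exported functions are internal, they may change between package versions without notice.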