r dplyr magrittr

Stepping through a pipeline with intermediate results


Is there a way to output the result of a pipeline at each step without doing it manually (e.g. without selecting and running each chunk by hand)?

I often find myself running a pipeline line-by-line to remember what it was doing or when I am developing some analysis.

For example:

library(dplyr)

mtcars %>% 
  group_by(cyl) %>% 
  sample_frac(0.1) %>% 
  summarise(res = mean(mpg))
# Source: local data frame [3 x 2]
# 
# cyl  res
# 1   4 33.9
# 2   6 18.1
# 3   8 18.7

I'd have to select and run:

mtcars %>% group_by(cyl)

and then...

mtcars %>% group_by(cyl) %>% sample_frac(0.1)

and so on...

But repeatedly selecting code and pressing CMD/CTRL+ENTER in RStudio is tedious, and I'd like a more efficient method.

Can this be done in code?

Is there a function that takes a pipeline and runs it line by line, showing the output of each step in the console and continuing when you press Enter, like demo(...) or example(...) do for package guides?


Solution

  • This is easy with a magrittr functional sequence, i.e. a pipeline that starts with the dot placeholder. For example, define a function my_chain with:

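    library(magrittr)

    # Three toy steps; each one adds 1 to its input
    foo <- function(x) x + 1
    bar <- function(x) x + 1
    baz <- function(x) x + 1
    # A pipeline that starts with . is stored as a function instead of being run
    my_chain <- . %>% foo %>% bar %>% baz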
    

    and get the final result of a chain as:

    my_chain(0)
    # [1] 3
    

    You can get a function list with functions(my_chain) and define a "stepper" function like this:

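    stepper <- function(fun_chain, x, FUN = print) {
      # Extract the individual steps of the functional sequence
      f_list <- functions(fun_chain)
      for (i in seq_along(f_list)) {
        # Apply step i to the running value, then report it
        x <- f_list[[i]](x)
        FUN(x)
      }
      # Return the final result without printing it again
      invisible(x)
    }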
    

    Then run the chain, printing the result after each step:

    stepper(my_chain, 0, print)
    
    # [1] 1
    # [1] 2
    # [1] 3
    

    Or pause for user input (press Enter) after each step; note that readline() only waits in an interactive session:

    stepper(my_chain, 0, function(x) {print(x); readline()})
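
To connect this back to the question, here is a minimal sketch that applies the same stepper to the dplyr pipeline from the question. The name cyl_pipeline is made up for illustration, and the stepper definition is repeated only so the block runs on its own. It assumes the pipeline can be rewritten to start with the dot placeholder instead of mtcars. Because sample_frac() draws random rows, the printed values will differ between runs.

```r
library(dplyr)
library(magrittr)

# Same stepper as above, repeated so this block is self-contained.
stepper <- function(fun_chain, x, FUN = print) {
  f_list <- functions(fun_chain)
  for (i in seq_along(f_list)) {
    x <- f_list[[i]](x)
    FUN(x)
  }
  invisible(x)
}

# The question's pipeline as a functional sequence (hypothetical name).
# Starting with . means the data is supplied later, when the function is called.
cyl_pipeline <- . %>%
  group_by(cyl) %>%
  sample_frac(0.1) %>%
  summarise(res = mean(mpg))

# Prints the grouped data, then the sampled rows, then the summary.
result <- stepper(cyl_pipeline, mtcars)
```

Swapping the default print for function(x) {print(x); readline()} gives the press-Enter-to-continue behaviour on the real pipeline as well.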