Piping histograms in dplyr (R)

Is it possible to pipe multiple graphs in dplyr?

This is working:

birdsss = data.frame(x1 = 1:10,x2 = 21:30,x3 = 41:50)
birdsss%>%  
  with(hist(x1, breaks = 50))

but this is not working:

birdsss%>%  
  with(hist(x1, breaks = 50)) %>%  
  with(hist(x2, breaks = 50)) %>%  
  with(hist(x3, breaks = 50))
Error in hist(x2, breaks = 50) : object 'x2' not found

I've also tried:

birdsss%>%  
  with(hist(x1, breaks = 50)) &  
  with(hist(x2, breaks = 50)) &  
  with(hist(x3, breaks = 50))

and

birdsss%>%  
  with(hist(x1, breaks = 50)) ;  
  with(hist(x2, breaks = 50)) ; 
  with(hist(x3, breaks = 50))

What could be the solution to print multiple columns in one line?

Something like:

 birdsss%>%  
      with(hist(x1:x3, breaks = 50))

I'm using a longer pipe (filter(), select(), etc.) and want to finish with multiple graphs. I simplified the code here.

Solution

`lapply`

To put some of my comments from above into an answer, the simplest way to make a histogram of each variable is:

# let's put them in a single plot
par(mfrow = c(1, 3))

lapply(birdsss, hist, breaks = 50)    # or chain into it: birdsss %>% lapply(hist, breaks = 50)

# set back to normal
par(mfrow = c(1, 1))

This does mess up the labels, though: every title and x-axis label reads `X[[i]]` instead of the column name.
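As a rough sketch of where those labels come from: hist() builds its default title and x-axis label from the expression it was called with, and lapply() calls it with the same internal expression for every column. Each histogram object stores that label as `xname`, so it can be inspected without drawing anything (this assumes the same `birdsss` data frame as in the question):

```r
birdsss <- data.frame(x1 = 1:10, x2 = 21:30, x3 = 41:50)

# plot = FALSE returns the histogram objects without drawing them
h <- lapply(birdsss, hist, breaks = 50, plot = FALSE)

# xname is the label hist() would have used for the title and x-axis;
# collect it for every column to see that the column names are lost
labels <- vapply(h, function(obj) obj$xname, character(1))
print(labels)
```

The list itself keeps the column names (`names(h)`), which is why the fix is to pass those names in explicitly.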

`Map`/`mapply`

To fix this with base, we'd need to iterate in parallel over the data and the labels, which can be done with Map or mapply (since we don't care about results—only the side effects—the difference doesn't matter):

par(mfrow = c(1, 3))

Map(function(x, y){hist(x, breaks = 50, main = y, xlab = y)}, 
    birdsss, 
    names(birdsss))

par(mfrow = c(1, 1))

Much prettier. However, if you want to chain into it, you'll need to use the . to show where the data is supposed to go:

birdsss %>% 
    Map(function(x, y){hist(x, breaks = 50, main = y, xlab = y)}, 
        ., 
        names(.))
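If the pipe should simply end in the plots, one option is to wrap the Map call and the par() bookkeeping in a small helper. This is only a sketch: the name `plot_all_hists` is made up for illustration, and it assumes every column of the data frame is numeric.

```r
birdsss <- data.frame(x1 = 1:10, x2 = 21:30, x3 = 41:50)

# Hypothetical helper: one histogram per column, side by side
plot_all_hists <- function(df, breaks = 50) {
  # par() returns the previous settings, so they can be restored on exit
  old <- par(mfrow = c(1, ncol(df)))
  on.exit(par(old), add = TRUE)

  # iterate over columns and their names in parallel
  Map(function(x, nm) hist(x, breaks = breaks, main = nm, xlab = nm),
      df, names(df))

  # hand the data back invisibly so the function can sit inside a pipe
  invisible(df)
}

res <- plot_all_hists(birdsss)
```

Because the data frame is the first argument, this can be chained without the dot: `birdsss %>% plot_all_hists()`.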

purrr

Hadley's purrr package makes *apply-style looping more obviously chainable (and though unrelated, working with lists easier) without worrying about .s. Here, since you're iterating for the side-effects and want to iterate over two variables, use walk2:

library(purrr)

walk2(birdsss, names(birdsss), ~hist(.x, breaks = 50, main = .y, xlab = .y))

which draws exactly the same plots as the previous Map call (if you set mfrow the same way), though without useless output to the console. (If you want that information, use map2 instead.) Note that the parameters to iterate over come first, though, so you can easily chain:

birdsss %>% walk2(names(.), ~hist(.x, breaks = 50, main = .y, xlab = .y))
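purrr also has iwalk, which passes each element together with its name, so the names(.) argument can be dropped. A minimal sketch, assuming purrr is installed and using the same `birdsss` data:

```r
library(purrr)

birdsss <- data.frame(x1 = 1:10, x2 = 21:30, x3 = 41:50)

old <- par(mfrow = c(1, 3))

# for a named input, iwalk(x, f) behaves like walk2(x, names(x), f):
# .x is the column, .y is its name
out <- iwalk(birdsss, ~hist(.x, breaks = 50, main = .y, xlab = .y))

par(old)
```

Like walk2, iwalk returns its input invisibly, so it can sit in the middle of a pipe: `birdsss %>% iwalk(~hist(.x, breaks = 50, main = .y, xlab = .y))`.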

ggplot

On a completely different tack, if you're planning on sticking everything in a single plot eventually anyway, ggplot2 makes making related plots very easy with its facet_* functions:

library(ggplot2)

# gather to long form, so there is a variable of variables to split facets by
birdsss %>% 
    tidyr::gather(variable, value) %>% 
    ggplot(aes(value)) + 
        # it sets bins instead of breaks, so add 1
        geom_histogram(bins = 51) + 
        # make a new "facet" for each value of `variable` (formerly column names), and 
        # use a convenient x-scale instead of the same for all 3
        facet_wrap(~variable, scales = 'free_x')

It looks a bit different, but everything is editable. Note you get nice labels without any work.
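In newer versions of tidyr (1.0.0 and later), gather() is superseded by pivot_longer(). The same plot can be sketched as follows, assuming ggplot2 and a recent tidyr are installed; the `long` and `p` names are only for illustration:

```r
library(ggplot2)
library(tidyr)

birdsss <- data.frame(x1 = 1:10, x2 = 21:30, x3 = 41:50)

# reshape to long form: one row per (column name, value) pair
long <- pivot_longer(birdsss, cols = everything(),
                     names_to = "variable", values_to = "value")

# one facet per original column, each with its own x-scale
p <- ggplot(long, aes(value)) +
  geom_histogram(bins = 51) +
  facet_wrap(~variable, scales = "free_x")

print(p)
```

Storing the plot in `p` before printing keeps it available for later tweaks, such as adding themes or labels.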
