How to save plots by names of items in a list


I can save a bunch of plots but it names them as the first value from each list item rather than the name of the variable.

delm2<-data.frame(N = c(5.881, 5.671, 7.628, 4.643, 6.598, 4.485, 4.465, 4.978, 4.698, 3.685, 4.915, 4.983, 3.288, 5.455, 5.411, 2.585, 4.321, 4.661), 
                  t1 = c("N", "N", "T", "T", "N", "N", "T", "N", "N", "N", "N", "T", "T", "T", "T", "T", "T", "N"), 
                  t3 = c("r","v", "r", "v", "v", "r", "c", "c", "v", "r", "c", "c", "r", "v","c", "r", "v", "c"), 
                  B = c(1.3, 1.3, 1.33, 1.25, 1.4, 1.34, 1.36, 1.39, 1.36, 1.42, 1.38, 1.31, 1.37, 1.44, 1.22, 1.4, 1.46, 1.35))


library(boot)

lapply(as.list(delm2[,c('N','B')]), 
       function(i){
         bmp(filename = paste0(i,".bmp"), width = 350, height = 400)
         glm.diag.plots(glm(i ~ t1*t3,data=del))
         dev.off()   
       })

This saves the plots but they are named with number values from the data rather than the name of each target of lapply... i.e. current output is two files named "5.881" and "1.3", when I want the same two files but named "N" and "B"

I thought I could change paste0(i,".bmp") to paste0(names(i),".bmp") but that just saves the first one, with no name at all.

It looks like you can give names that are just integers in "How to save and name multiple plots with R", but I want the names of the variables from the list, i.e. the two numeric columns N and B in delm2.

It looks from "Saving a list of plots by their names()" like this would be easier with ggplot output, but ggsave didn't work on a glm.diag.plots output.


Solution

  • (Disregard my previous suggestion using Map.)

    The big takeaway is how to derive the formula dynamically. One way is with as.formula, which takes a string and converts it into a formula that can be used in a model-fitting function such as glm.
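
    As a minimal sketch of that idea (the variable names nm, f and f2 are only illustrative), a column name held in a string can be turned into a formula like this. reformulate() is a base R alternative that may read more cleanly if you would rather not paste the tilde by hand:

    ```r
    # Build a formula from a column name held in a string.
    nm <- "N"
    f <- as.formula(paste(nm, "~ t1*t3"))

    class(f)      # "formula"
    all.vars(f)   # "N"  "t1" "t3"

    # Base R alternative: supply the right-hand side and the response separately.
    f2 <- reformulate("t1*t3", response = nm)
    all.vars(f2)  # "N"  "t1" "t3"
    ```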

    One problem with using lapply(as.list(delm2[,c('N','B')]), ...) is that the remainder of the data (i.e., columns t1 and t3) is not passed; the function receives just one vector at a time, without its name. (I'm wondering if your reference to del is a typo, un-released/hidden data, or something else.)
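
    A small sketch of what the function actually receives may help. Here df is a cut-down stand-in for delm2 (an assumption for illustration, using the first values from the question). It also suggests why paste0(names(i), ".bmp") ended up producing a file with no name:

    ```r
    # Cut-down stand-in for delm2 (values taken from the question).
    df <- data.frame(N = c(5.881, 5.671), B = c(1.3, 1.3))

    # Inside the function, i is a bare numeric vector: no column name, no t1/t3.
    inner_names <- lapply(as.list(df), function(i) names(i))
    inner_names$N          # NULL

    # With a NULL name, paste0() returns a file name with no stem.
    paste0(NULL, ".bmp")   # ".bmp"

    # The names live on the list itself, not on its elements.
    names(df)              # "N" "B"
    ```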

    Try this:

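    lapply(c("N", "B"), function(nm) {
      # paste0, not paste: paste would insert a space and give "N .bmp"
      bmp(filename = paste0(nm, ".bmp"), width = 350, height = 400)
      # build the formula from the column name; fit on the full data frame
      glm.diag.plots(glm(as.formula(paste(nm, "~ t1*t3")), data = delm2))
      dev.off()
    })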
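
    A detail worth checking when building file names: paste() puts a space between its arguments by default, whereas paste0() does not, so paste0() (or paste(..., sep = "")) is the safer choice for names like "N.bmp". A quick comparison:

    ```r
    nm <- "N"
    with_space <- paste(nm, ".bmp")             # "N .bmp" (default sep = " ")
    no_space_a <- paste(nm, ".bmp", sep = "")   # "N.bmp"
    no_space_b <- paste0(nm, ".bmp")            # "N.bmp"
    ```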
    

    In general, I don't like breaching scope inside these functions. That is, I try not to reach outside lapply for data when I can pass it in fairly easily. A more pedantic version of the above could look like:

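    lapply(c("N", "B"), function(nm, x) {
      # paste0, not paste: paste would insert a space and give "N .bmp"
      bmp(filename = paste0(nm, ".bmp"), width = 350, height = 400)
      glm.diag.plots(glm(as.formula(paste(nm, "~ t1*t3")), data = x))
      dev.off()
    }, x = delm2)  # the data frame is passed in explicitly as x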
    

    While this preserves scope, it may be confusing if you do not understand what is going on.

    This might be a great time to use for instead of one of the *apply functions. Everything you want is a side effect (files written to disk), and since for and *apply are effectively the same speed, you gain readability:

    for (nm in c("N", "B")) {
      # paste0, not paste: paste would insert a space and give "N .bmp"
      bmp(filename = paste0(nm, ".bmp"), width = 350, height = 400)
      glm.diag.plots(glm(as.formula(paste(nm, "~ t1*t3")), data = delm2))
      dev.off()
    }
    

    (In this case, there is no "scope breach", so I used the original variable.)
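
    If you would rather not hard-code c("N", "B"), the names could be derived from the data itself. This is only a sketch under the assumption that every numeric column is a response you want plotted. Here df is a cut-down stand-in for delm2, and num_cols is an illustrative name:

    ```r
    # Cut-down stand-in for delm2 (first rows of the question's data).
    df <- data.frame(N  = c(5.881, 5.671, 7.628),
                     t1 = c("N", "N", "T"),
                     t3 = c("r", "v", "r"),
                     B  = c(1.3, 1.3, 1.33))

    # Names of the numeric columns, in column order.
    num_cols <- names(df)[vapply(df, is.numeric, logical(1))]
    num_cols                              # "N" "B"

    # One output file name per response variable.
    file_names <- paste0(num_cols, ".bmp")
    file_names                            # "N.bmp" "B.bmp"
    ```

    The loop header would then become for (nm in num_cols) with the body unchanged.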


    Parenthetically, to tie in my now-edited-out and incorrect answer that included Map: here's an example using Map that does more to demonstrate what Map (and mapply) are doing than to improve on your immediate need.

    If for some reason you wanted them named something distinct from "N.bmp", you could do this:

    Map(function(fn, vn, x) {
      # fn is the output file name, vn the response variable, x the data
      bmp(filename = fn, width = 350, height = 400)
      glm.diag.plots(glm(as.formula(paste(vn, "~ t1*t3")), data = x))
      dev.off()
    }, c("N2.bmp", "b3456.bmp"), c("N", "B"), list(delm2))
    

    Two things to note from this:

    • The use of list(delm2) is to wrap that structure into a single "thing" that is passed repeatedly to the mapped function. If we passed just delm2 (no list(...)), then Map would step through delm2 one column at a time, pairing its first column with the first element of each of the other arguments, and so on. This might be useful in other scenarios, but in your glm example you need the other columns present, so you cannot include just one column at a time. (Well, there are ways to do that, too ... but that is not important at the moment.)
    • The first time the anonymous function is called, fn is "N2.bmp", vn is "N", and x is the full dataset of delm2. The second time the anon-func is called, fn is "b3456.bmp", vn is "B", and x is again the full dataset of delm2.
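
    To see the difference the list(...) wrapper makes without writing any files, here is a small sketch. The data frame df is a cut-down stand-in for delm2, and files, vars, wrapped and bare are illustrative names:

    ```r
    df <- data.frame(N = c(5.881, 5.671), B = c(1.3, 1.3))
    files <- c("N2.bmp", "b3456.bmp")
    vars  <- c("N", "B")

    # Wrapped: list(df) has length 1, so it is recycled and
    # x is the whole data frame on every call.
    wrapped <- Map(function(fn, vn, x) is.data.frame(x), files, vars, list(df))
    unlist(wrapped)   # TRUE for both calls

    # Bare: Map walks df column by column, so x is a single
    # vector on every call and the other columns are missing.
    bare <- Map(function(fn, vn, x) is.data.frame(x), files, vars, df)
    unlist(bare)      # FALSE for both calls
    ```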

    I label this portion "parenthetic" because it really doesn't add to this problem, but since I started that way in my first-cut answer, I thought I'd continue with the methodology, the "why" of my choice of Map. In the end, I think the for solution or one of the lapply solutions should be fine for you.