R: Complex Axis Format Expression


I am using ggplot2 and trying to build a complex axis label by combining paste and expression.

For example, I am trying to show a value of 0.5e-6 both as 0.5 microseconds and as 5 x 10^-7 seconds in the y-axis labels.

So far I can do either of these, but not both. A minimal working example is given below.

library(ggplot2)

dat <- data.frame(
  # factor levels must be unique; duplicated levels raise an error in current R
  A = factor(c("O", "O", "P", "P", "Q", "Q", "O", "O", "P", "P", "Q", "Q"), levels = c("O", "P", "Q")),
  B = factor(c("P-0.1", "P-0.1", "P-0.1", "P-0.1", "P-0.1", "P-0.1", "P-0.2", "P-0.2", "P-0.2", "P-0.2", "P-0.2", "P-0.2"), levels = c("P-0.1", "P-0.2")),
  X = c( 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
  Y = c(1e-6, 1.5e-6, 1.2e-6, 1.3e-6, 0.9e-6, 1.4e-6, 3.0e-6, 2.0e-6, 3.2e-6, 2.1e-6, 2.7e-6, 1.9e-6)
)

fancy_scientific_text <- function(l) {
  # turn into a character string in scientific notation
  l <- format(l, scientific = TRUE)
  # quote the part before the exponent to keep all the digits
  l <- gsub("^(.*)e", "'\\1'e", l)
  # drop the '+' from positive exponents
  l <- gsub("e\\+", "e", l)
  # turn the 'e' into plotmath format
  l <- gsub("e", "%*%10^", l)
  # drop a mantissa of exactly 1, leaving just 10^n
  l <- gsub("\\'1[\\.0]*\\'\\%\\*\\%", "", l)
  # show zero as a plain 0
  l <- gsub("\\'0[\\.0]*\\'\\%\\*\\%10\\^00", "0", l)
  return(l)
}

fancy_scientific <- function(l) {
  # return this as an expression
  parse(text=fancy_scientific_text(l))
}


human_time_format <- function(y){
  if (!is.na(y)){
    substitute(paste(m, " ", mu, "s", sep=""), list(m=y*1e6))
  }
}

human_times <- function(x = NULL, smbl ="sec"){
  sapply(x, human_time_format)
}

human_time_format_combined <- function(y){
  if (!is.na(y)){
    substitute(paste(y_lab, " (", m, " ", mu, "s)", sep=""), list(m=y*1e6, y_lab=fancy_scientific(y)))
  }
}

human_times_combined <- function(x = NULL, smbl ="sec"){
  sapply(x, human_time_format_combined)
}



p = ggplot(data=dat, aes(x=X, y=Y, colour=A, size=A, shape=A, linetype=A, fill=B, group=interaction(A,B))) + geom_point() + geom_line() + theme_bw()
p = p + geom_point(size=4, alpha=0) + geom_point(size=4, show.legend=FALSE) + guides(shape = guide_legend(nrow=3, byrow = TRUE, keywidth = 1.5, keyheight = 1), colour = guide_legend(override.aes = list(alpha=1)))

p = p + scale_shape_manual(name="", values=c(21,22,23))
p = p + scale_colour_manual(name="", values=c("#005ccc", "#007700", "#56B4E9"))
p = p + scale_linetype_manual(name="", values=c(0,0,1))
p = p + scale_size_manual(name="", values = c(1, 1, 1))
p = p + scale_fill_manual(name = "", values = c("red", "blue"), guide = guide_legend(override.aes = list(shape = 22, size = 5)))

p0 = p + ggtitle("p0")
p1 = p + scale_y_continuous(name = "Y", labels = fancy_scientific) + ggtitle("p1")
p2 = p + scale_y_continuous(name = "Y", labels = human_times) + ggtitle("p2")
p3 = p + scale_y_continuous(name = "Y", labels = human_times_combined) + ggtitle("p3")

And here is the output (plots p0 to p3; image not reproduced here).

p0 is the unformatted version, p1 uses the scientific format, p2 uses the metric unit format, and p3 is the intended format, which should combine both p1's and p2's formats. However, I could not get the scientific format to show up in p3.


Solution

  • This is basically what I put in the comment. You can't embed an "expression" inside another expression; you need to combine the calls inside it. fancy_scientific returns an expression (the result of parse), and you can get at its contents via indexing. So if you change the human_time_format_combined function to extract the first element of the expression returned by fancy_scientific, you'll be all set:

    human_time_format_combined <- function(y){
      if (!is.na(y)){
        substitute(paste(y_lab, " (", m, " ", mu, "s)", sep=""), 
            list(m=y*1e6, y_lab=fancy_scientific(y)[[1]]))
      }
    }
    

    Then p3 will show both formats in each y-axis label (image not reproduced here).
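To see why the [[1]] matters, here is a minimal sketch in base R (no ggplot2 needed). The label string is just an illustrative value of the kind fancy_scientific_text produces, and the names e, nested and flat are made up for this example.

```r
# parse() returns an expression vector, even when there is only one label
e <- parse(text = "'5'%*%10^-07")
class(e)       # an "expression" vector
class(e[[1]])  # the element inside it is a "call"

# Substituting the whole vector nests an expression inside the paste() call,
# which is the situation in the original human_time_format_combined
nested <- substitute(paste(y_lab, " s"), list(y_lab = e))
is.expression(nested[[2]])

# Substituting the extracted call gives one flat call that plotmath can draw
flat <- substitute(paste(y_lab, " s"), list(y_lab = e[[1]]))
is.call(flat[[2]])
```

In the nested version the second element of the call is still an expression vector, while in the flat version it is an ordinary call.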

    Also note that you often don't need paste(): in ?plotmath expressions it simply juxtaposes its arguments rather than behaving like the regular paste(). It can often be replaced by *, which juxtaposes two terms, with ~~ adding extra space between them. For example:

    human_time_format <- function(y){
      if (!is.na(y)){
        substitute(m*mu*"s", list(m=y*1e6))
      }
    }
    
    human_time_format_combined <- function(y){
      if (!is.na(y)){
        substitute(y_lab~~(m*" "*mu*"s"), list(m=y*1e6, y_lab=fancy_scientific(y)[[1]]))
      }
    }
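As a rough sketch of how such a labeller behaves on a vector of breaks, the snippet below builds the labels without drawing anything. micro_label and the breaks vector are hypothetical names and values chosen for illustration; only base R is used.

```r
# Hypothetical helper: build one plotmath label of the form m * mu * "s"
micro_label <- function(y) {
  substitute(m * mu * "s", list(m = y * 1e6))
}

# Illustrative break values in seconds (an assumption, not taken from the plot)
breaks <- c(1e-6, 2e-6, 3e-6)

# lapply() gives a list of calls; as.expression() turns it into an
# expression vector, one element per break
labs <- as.expression(lapply(breaks, micro_label))

length(labs)        # one label per break
is.call(labs[[1]])  # each element is a plain call, not a nested expression
```

The resulting vector can be passed to anything that accepts plotmath labels, for example the labels argument of axis() in base graphics.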