
geom_text() on faceted panels causes changes in symbol mapping


I am building a faceted plot where each panel has 3 different sets of points, each with a different shape. Everything works fine until I try to add some text to each panel using geom_text(). When I include geom_text() in the plot, I get an error message saying that I have "insufficient values in a manual scale: 4 are needed but only 3 are provided". I can get rid of the error by adding an extra shape and color, but I cannot control the re-mapping of the shape/color factors that occurs when I add geom_text().

Here is the script that I am running:

library(ggplot2)
library(RColorBrewer)
library(cowplot)

make_hist_df <- function (vals, breaks, expt, type) {
  hist_d <- hist(vals,breaks=breaks,plot=FALSE)
  hist_d_nz<-hist_d$counts > 0
  n_d_nz<-length(hist_d$counts[hist_d_nz])
  hist_df <- data.frame(expt=character(n_d_nz), counts=numeric(n_d_nz), mids=numeric(n_d_nz),type=character(n_d_nz))
  hist_df$counts <- hist_d$counts[hist_d_nz]
  hist_df$mids <- hist_d$mids[hist_d_nz]
  hist_df$expt = expt
  hist_df$type = type

  return(hist_df)
  }

## get some normal distributions
n1<-rnorm(n=10000, mean=5,sd=1)
n2<-rnorm(n=5000,mean=15,sd=1)
n3<-rnorm(n=2000,mean=25,sd=1)

breaks=seq(0,30,0.5)

tot_hist_df = rbind(
  make_hist_df(n1,breaks,expt='one',type='low'),
  make_hist_df(n2,breaks,expt='one',type='mid'),
  make_hist_df(n3,breaks,expt='one',type='high')
  )

tot_hist_df = rbind(tot_hist_df,
  make_hist_df(n1,breaks,expt='two',type='low'),
  make_hist_df(n2,breaks,expt='two',type='mid'),
  make_hist_df(n3,breaks,expt='two',type='high')
  )

tot_hist_df$expt<-factor(tot_hist_df$expt,levels=c('one','two'), ordered=TRUE)
tot_hist_df$type<-factor(tot_hist_df$type,levels=c('low','mid','high'), ordered=TRUE)

s.open_circ<-1
s.closed_circ<-16
s.triangle <- 2
s.plus<-4
s.dot <- 20
sb.shapes   = c(s.open_circ, s.triangle, s.closed_circ)
sb.shapes_l = c(s.open_circ, s.triangle, s.closed_circ, s.dot)

q_set <- c('N1','N2','N3')
n_q <- length(q_set)

sb.colors <-brewer.pal(max(3,n_q),'Dark2') # 'Dark2', 'Set2', 'Paired'
sb.colors_l <- c(sb.colors,'black')

sb.sizes = rep(1.25,n_q)

## plot out without labels
p1 <- ggplot(data=tot_hist_df,aes(x=mids, y=counts, shape=type, color=type))+geom_point() +
  scale_color_manual(values=sb.colors) +
  scale_shape_manual(values=sb.shapes) +
  facet_wrap(~expt, ncol=2)

## make a label dataframe
hist_label=data.frame(expt=c('one','two'), lab=c('mean 5, 20','mean 5, 20 - dup'),type=c('xlab','xlab'))
hist_label$expt <- factor(hist_label$expt,levels=c('one','two'),ordered=TRUE)
hist_label$type <- factor(hist_label$type,levels=c('low','mid','high','xlab'),ordered=TRUE)

## plot out with labels
p2 <- ggplot(data=tot_hist_df,aes(x=mids, y=counts, shape=type, color=type))+geom_point() +
  geom_text(data=hist_label, aes(x=10,y=1000,label=lab)) +
  scale_color_manual(values=sb.colors_l) +
  scale_shape_manual(values=sb.shapes_l) +
  facet_wrap(~expt, ncol=2)

plot_grid(p1,p2,ncol=1)

And here is the output that it produces: [image: bad plot]

Both the colors and shapes differ between the top and bottom panel.

I do not understand why geom_text() is remapping the factor levels specified by "type", after I have explicitly specified them for both the plotted data and for the label data structure. This remapping (to something that looks alphabetical) throws off both the colors and the shapes.


Solution

  • Thank you for the reproducible example. It made it much easier to understand what was happening.

    The remapping you describe happens while the scale learns its discrete values (training the scale). The point layer contributes the levels low, mid and high, and the text layer then adds xlab. When the scale merges the two sets, it does not preserve the level order it learned from the histogram data, which is why the result looks alphabetical. To fix this, you can manually set the limits of the scale.

    ggplot(data=tot_hist_df,
           aes(x=mids, y=counts, shape=type, color=type))+
      geom_point() +
      geom_text(data=hist_label, 
                aes(x=10,y=1000,label=lab)) +
      scale_color_manual(values=sb.colors_l, limits = levels(hist_label$type)) +
      scale_shape_manual(values=sb.shapes_l, limits = levels(hist_label$type)) +
      facet_wrap(~expt, ncol=2)
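
As a minimal sketch of why setting the limits helps, the toy example below gives the colour scale four unnamed values plus explicit limits, then uses layer_data() to check which colour each point actually received. The data frames pts and txt and the colour names are made up for illustration and are not taken from the question.

```r
library(ggplot2)

# Hypothetical toy data: three point types plus one text-only level, "xlab"
lv_all <- c("low", "mid", "high", "xlab")
pts <- data.frame(x = 1:3, y = 1:3,
                  type = factor(c("low", "mid", "high"), levels = lv_all[1:3]))
txt <- data.frame(x = 2, y = 3, lab = "note",
                  type = factor("xlab", levels = lv_all))

p <- ggplot(pts, aes(x = x, y = y, colour = type)) +
  geom_point() +
  geom_text(data = txt, aes(label = lab)) +
  # Unnamed values are matched to the limits by position
  scale_colour_manual(values = c("red", "green", "blue", "black"),
                      limits = lv_all)

# Colours assigned to the point layer, ordered by x (low, mid, high)
pt <- layer_data(p, 1)
point_colours <- as.character(unname(pt$colour[order(pt$x)]))
print(point_colours)
```

Because the limits fix both the set and the order of the levels, the position of each value in values decides which level it belongs to, whatever order the layers are trained in.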
    


    EDIT:

    Alternatively, you can set drop = FALSE in the manual scales:

    ggplot(data=tot_hist_df,
           aes(x=mids, y=counts, shape=type, color=type))+
      geom_point() +
      geom_text(data=hist_label, 
                aes(x=10,y=1000,label=lab)) +
      scale_color_manual(values=sb.colors_l, drop = FALSE) +
      scale_shape_manual(values=sb.shapes_l, drop = FALSE) +
      facet_wrap(~expt, ncol=2)
    

    END EDIT

    Below are a few tips that you might find useful. The first is that you can use named vectors in manual discrete scales, which keeps the link between each level and its colour/shape regardless of whether that level is present in the data.

    This is also useful when you have to make multiple similar plots and you don't know which subset of categories will be present in any single plot.

    Notice in the resulting plot that the colours are correctly matched to the data, and that no legend item is shown for the unused not_in_data colour.

    my_colours <- setNames(object = c(brewer.pal(3, 'Dark2'), 'black', 'blue'), 
                           nm = c(levels(hist_label$type), "not_in_data"))
    my_shapes <- setNames(object = c(1, 2, 16, NA), 
                          nm = levels(hist_label$type))
    
    ggplot(data=tot_hist_df,
           aes(x=mids, y=counts, shape=type, color=type))+
      geom_point() +
      geom_text(data=hist_label, 
                aes(x=10,y=1000,label=lab)) +
      scale_color_manual(values=my_colours) +
      scale_shape_manual(values=my_shapes) +
      facet_wrap(~expt, ncol=2)
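
To illustrate the point about subsets, here is a small sketch with a hypothetical named palette (pal) and a toy data frame (sub) that contains only two of the three levels. Neither object comes from the question. The check with layer_data() shows that each level still receives the colour attached to its name.

```r
library(ggplot2)

# Hypothetical named palette; "not_in_data" never occurs in the data
pal <- c(low = "red", mid = "green", high = "blue", not_in_data = "grey50")

# Only a subset of the levels is present in this data set
sub <- data.frame(x = 1:2, y = 1:2,
                  type = factor(c("high", "mid"),
                                levels = c("low", "mid", "high")))

p_sub <- ggplot(sub, aes(x = x, y = y, colour = type)) +
  geom_point() +
  # Named values are matched to levels by name, not by position
  scale_colour_manual(values = pal)

# Colours assigned to the points, ordered by x (high, then mid)
d <- layer_data(p_sub, 1)
sub_colours <- as.character(unname(d$colour[order(d$x)]))
print(sub_colours)
```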
    


    The second tip is that you can avoid including the text in the legend by having the text layer not inherit the main plot's aesthetics. By default the text layer inherits shape and color from the main plot call, which it probably doesn't need. Also note that since the scales no longer have to be trained on the text's colour and shape, the correct order of factor levels from the histogram is restored.

    ggplot(data=tot_hist_df,
           aes(x=mids, y=counts, shape=type, color=type)) + 
      geom_point() +
      geom_text(data=hist_label, 
                aes(x=10,y=1000, label=lab),
                inherit.aes = FALSE) +
      scale_color_manual(values=my_colours) +
      scale_shape_manual(values=my_shapes) +
      facet_wrap(~expt, ncol=2)
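
A rough way to confirm that the text layer no longer feeds the colour scale is to supply only three values and check that the plot still builds. The sketch below uses made-up toy data (pts and txt) and colour names. As an assumption for this example, the label data frame has no type column at all.

```r
library(ggplot2)

# Hypothetical toy data standing in for the histogram points
pts <- data.frame(x = 1:3, y = 1:3,
                  type = factor(c("low", "mid", "high"),
                                levels = c("low", "mid", "high")))
# Label data frame without a 'type' column (assumption for this sketch)
txt <- data.frame(x = 2, y = 3, lab = "note")

p <- ggplot(pts, aes(x = x, y = y, colour = type)) +
  geom_point() +
  # The text layer uses only its own aesthetics
  geom_text(data = txt, aes(x = x, y = y, label = lab),
            inherit.aes = FALSE) +
  # Three values suffice because only the point layer trains the scale
  scale_colour_manual(values = c("red", "green", "blue"))

# Colours assigned to the points, ordered by x (low, mid, high)
pt <- layer_data(p, 1)
point_colours <- as.character(unname(pt$colour[order(pt$x)]))
print(point_colours)
```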
    


    An alternative way to achieve the same thing is to define the aesthetics in the point layer instead of in the main plot call (the plot is identical to the one above).

    ggplot(data=tot_hist_df) + 
      geom_point(aes(x=mids, y=counts, shape=type, color=type)) +
      geom_text(data=hist_label, 
                aes(x=10,y=1000, label = lab)) +
      scale_color_manual(values=sb.colors) +
      scale_shape_manual(values=sb.shapes) +
      facet_wrap(~expt, ncol=2)
    

    I hope that helped!