r, ggplot2, shapes, bubble-chart

Manually set shape by factor


An example dataset:

A <- c('a','b', 'c','d','e')
types <- factor(A)
B <- c(1,2,3,4,5)
C <- c(6,7,8,9,10)
D <- c(1,2,1,2,3)
ABC <- data.frame(B,C,D,types)

library(ggplot2)

ggplot(ABC, aes(x=B, y=C, size=D, colour=as.factor(types), label=types, shape=as.factor(types))) +
  geom_point() + geom_text(size=2, hjust=0, colour="black", vjust=0) +
  scale_size_area(max_size=20, "D", breaks=c(100,500,1000,3000,5000)) +
  scale_x_log10(limits=c(0.05,10), breaks=c(0.1,1,10)) + scale_y_continuous(limits=c(0,30000000)) +
  scale_shape_manual(values=c(15,18,16,17,19))

Plotting this, you will see that the factor levels a-e each have a colour and a shape assigned to them.

In my code I use scale_shape_manual to set the shapes, and they are assigned by position: the order of the factor levels is a, b, c, d, e and my values are 15, 18, 16, 17, 19, so a=15 (a square), b=18, and so on.
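The positional matching described above can be checked with a small sketch. The helper `shapes_drawn()` is hypothetical (it is not part of ggplot2) and only reads back the shape ggplot2 computes for each row via `layer_data()`. It assumes a reasonably recent ggplot2 is installed.

```r
library(ggplot2)

# Hypothetical helper: returns the shape drawn for each element of `f`
# when scale_shape_manual() is given an unnamed vector of values
shapes_drawn <- function(f, values) {
  df <- data.frame(x = seq_along(f), y = 1, f = f)
  p <- ggplot(df, aes(x = x, y = y, shape = f)) +
    geom_point() +
    scale_shape_manual(values = values)
  built <- layer_data(p, 1)
  # Reorder by x so the result lines up with the input rows
  setNames(built$shape[order(built$x)], as.character(f))
}

vals <- c(15, 18, 16, 17, 19)

# Levels in alphabetical order: a is the first level, so it gets 15
alphabetical <- shapes_drawn(factor(c("a", "b", "c", "d", "e")), vals)
print(alphabetical)

# Same values, but the levels are reversed: e is now first and gets 15
reversed <- shapes_drawn(
  factor(c("a", "b", "c", "d", "e"), levels = c("e", "d", "c", "b", "a")),
  vals
)
print(reversed)
```

With an unnamed vector, the shape a level receives depends only on where that level sits in the level order, which is why the shapes shift when the data changes.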

I would like to set these shapes by factor. My data will be changing each day and the factors will be in different orders but I always want the same factors to have the same shapes.

So obviously this code doesn't work but something like:

scale_shape_manual(values=('a'=15, 'b'=18, 'c'=16, 'd'=17, 'e'=19))

It would also be helpful if I could do the same for colour.


Solution

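  • If I'm understanding you correctly, there will always be (at most) the five categories "a" to "e", and you want the shapes and colors for these to be consistent across datasets. One way is to pass named vectors to scale_shape_manual() and scale_color_manual(), so that each value is matched to its factor level by name rather than by position (note: gg_color_hue(...) is a helper borrowed from elsewhere that reproduces ggplot2's default colors):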

    # set up shapes
    shapes <- c(15,18,16,17,19)
    names(shapes) <- letters[1:5]
    
    # set up colors
    gg_color_hue <- function(n) { # ggplot default colors
      hues <- seq(15, 375, length.out=n+1)
      hcl(h=hues, l=65, c=100)[1:n]
    }
    colors <- gg_color_hue(5)
    names(colors) <- names(shapes)
    
    # original data
    ggplot(ABC, aes(x=B ,y=C ,size=D, colour=types,label=types, shape=types)) +
      geom_point()+geom_text(size=2, hjust=0,colour="black", vjust=0) +
      scale_size_area(max_size=20, "D", breaks=c(100,500,1000,3000,5000))  +
      scale_x_log10(lim=c(0.05,10),breaks=c(0.1,1,10))+ 
      scale_y_continuous(lim=c(0,30000000)) +
      scale_shape_manual(values=shapes) + scale_color_manual(values=colors)
    

    #new data
    DEF <- data.frame(B,C,D,types=factor(c("a","a","a","d","e")))
    ggplot(DEF, aes(x=B ,y=C ,size=D, colour=types,label=types, shape=types)) +
      geom_point()+geom_text(size=2, hjust=0,colour="black", vjust=0) +
      scale_size_area(max_size=20, "D", breaks=c(100,500,1000,3000,5000))  +
      scale_x_log10(lim=c(0.05,10),breaks=c(0.1,1,10))+ 
      scale_y_continuous(lim=c(0,30000000)) +
      scale_shape_manual(values=shapes) + scale_color_manual(values=colors)
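As a quick check of the name-based matching used above, here is a minimal sketch. The data frame `rev_df` is made up for illustration: it contains only three of the five levels, listed in reverse order. The shapes are read back with `layer_data()` instead of being judged by eye. It assumes a reasonably recent ggplot2.

```r
library(ggplot2)

# Names are the factor levels; values are the shapes pinned to them
shapes <- c(a = 15, b = 18, c = 16, d = 17, e = 19)

# Hypothetical data: only three levels are present, in reverse order
rev_df <- data.frame(
  x = 1:3,
  y = 6:8,
  types = factor(c("e", "c", "a"), levels = c("e", "c", "a"))
)

p <- ggplot(rev_df, aes(x = x, y = y, shape = types)) +
  geom_point(size = 4) +
  scale_shape_manual(values = shapes)

# Shapes computed by ggplot2 for each row, labelled by level
built <- layer_data(p, 1)
drawn <- setNames(built$shape[order(built$x)], as.character(rev_df$types))
print(drawn)
```

Each level keeps the shape attached to its name, whatever the level order or which levels are present. The inline form from the question should also work once the values are wrapped in c(), i.e. scale_shape_manual(values=c('a'=15, 'b'=18, 'c'=16, 'd'=17, 'e'=19)), and a named vector of colours can be passed to scale_colour_manual() in the same way.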