r ggplot2 position legend geom-point

Colors and Shapes in geom_point legend


Hi there, I have what I think is a very simple problem, but I'm not sure how to fix it.

Basically, I have a geom_point plot comparing two approaches, and I want to distinguish them by both shape and color. At first it seems to work correctly; however, once I get to the legend something weird happens...

It looks like R does not recognize the guides argument: I can neither set the legend title to the one I chose nor move it on top of the legend itself. Additionally, R appears to pick up the geom_label_repel glyph as the icon shown in the legend...

If there is a fix to this, any suggestion would be appreciated. Thanks in advance!

See below for the plot and the code:

library(readxl)
library(ggplot2)
library(ggrepel)
library(RColorBrewer)

excel_variants <- read_excel("/path/to/papuan_samples_variant-plot.xlsx", 2)

df_variants <- data.frame(excel_variants)

df_variants$sample <- factor(df_variants$sample, levels=c('6103675', '6103673', '6103671'))
#df_variants$approach <- factor(df_variants$approach, levels=c('filtered', 'personalized'))

### PLOT THE DATA
plot <- 
  
  ggplot(df_variants, aes(x=sample, y=value, color=approach, shape=approach)) + geom_point(position=position_dodge(width=0.15), size=2) +
  
  theme(plot.title=element_text(face='bold.italic', hjust=.5), legend.title=element_text(face='italic'), legend.position='bottom', legend.direction='horizontal') +
  scale_shape_manual(values=c('filtered'=15, 'personalized'=16)) + scale_color_manual(values=brewer.pal(11, "PRGn")[c(3, 9)]) + 
  ggtitle("SVs metrics") + guides(values=guide_legend(title='graph used', title.position='top', title.hjust=.5)) + 
  coord_flip()
plot

plot + geom_label_repel(aes(label=labels),
                 box.padding  = 0.5, 
                 point.padding = 0,
                 segment.color = 'grey50')

EDIT: data frame structure

structure(list(sample = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 2L, 
2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), levels = c("6103675", 
"6103673", "6103671"), class = "factor"), approach = c("personalized", 
"personalized", "personalized", "filtered", "filtered", "filtered", 
"personalized", "personalized", "personalized", "filtered", "filtered", 
"filtered", "personalized", "personalized", "personalized", "filtered", 
"filtered", "filtered"), value = c(0.997, 0.5088, 0.6737, 0.9997, 
0.3329, 0.4995, 0.9973, 0.5174, 0.6814, 0.9998, 0.338, 0.5052, 
0.9971, 0.5031, 0.6688, 0.9999, 0.3335, 0.5001), labels = c("precision", 
"recall", "F1", "precision", "recall", "F1", "precision", "recall", 
"F1", "precision", "recall", "F1", "precision", "recall", "F1", 
"precision", "recall", "F1")), row.names = c(NA, -18L), class = "data.frame")

Solution

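  • Assign the name parameter in your scale_*() calls to set the legend title. Use the same name in both scales so that the color and shape legends stay merged into one.

    Add theme(legend.position = "top") to set the legend position.

    Add show.legend = FALSE to the geom_label_repel() call to remove the letters from the legend: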

    plot <- 
      ggplot(df_variants, aes(x=sample, y=value, color=approach, shape=approach)) + 
      geom_point(position=position_dodge(width=0.15), size=2) +
      theme(plot.title=element_text(face='bold.italic', hjust=.5), legend.title=element_text(face='italic'), legend.position='bottom', legend.direction='horizontal') +
      # the same name on both scales gives one merged legend with that title
      scale_shape_manual(values=c('filtered'=15, 'personalized'=16), name = "My Legend") + 
      scale_color_manual(values=brewer.pal(11, "PRGn")[c(3, 9)], name = "My Legend") + 
      # guides(values=...) from the question is dropped here:
      # 'values' is not an aesthetic, so that call had no effect on the legend
      ggtitle("SVs metrics") + 
      coord_flip()
    
    # show.legend = FALSE keeps the label glyph out of the legend keys
    plot + geom_label_repel(aes(label=labels),
                            box.padding  = 0.5, 
                            point.padding = 0,
                            segment.color = 'grey50', 
                            show.legend = FALSE) +
      # a later theme() call overrides the earlier legend.position='bottom'
      theme(legend.position = "top")
    

    Note that I could not debug the script, as there was no data provided in your question. In future questions, please include the output of dput(yourDataFrame).
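
If the goal is to keep the guides() approach from the question, one possible fix is to key guides() by aesthetic name (color, shape) rather than by values, which is not an aesthetic. The following is a minimal sketch, not a tested drop-in for the original script. It assumes a ggplot2 version in which guide_legend() accepts title.position (the argument the question already uses), it uses only the six "recall" rows from the posted data, and df_demo, top_title, and the two color names are placeholders chosen for illustration.

```r
library(ggplot2)
library(ggrepel)

# Subset of the posted data (recall rows only); df_demo is a placeholder name
df_demo <- data.frame(
  sample   = factor(rep(c("6103675", "6103673", "6103671"), each = 2),
                    levels = c("6103675", "6103673", "6103671")),
  approach = rep(c("filtered", "personalized"), times = 3),
  value    = c(0.3335, 0.5031, 0.338, 0.5174, 0.3329, 0.5088),
  labels   = "recall"
)

# One guide specification reused for both aesthetics, so the two legends
# share a title and should be drawn as a single merged legend
top_title <- guide_legend(title = "graph used",
                          title.position = "top",
                          title.hjust = 0.5)

p <- ggplot(df_demo, aes(x = sample, y = value,
                         color = approach, shape = approach)) +
  geom_point(position = position_dodge(width = 0.15), size = 2) +
  # show.legend = FALSE keeps the label glyph out of the legend keys
  geom_label_repel(aes(label = labels), box.padding = 0.5,
                   show.legend = FALSE) +
  scale_shape_manual(values = c(filtered = 15, personalized = 16)) +
  # illustrative colors, not the PRGn palette from the question
  scale_color_manual(values = c(filtered = "purple",
                                personalized = "darkgreen")) +
  # guides() is keyed by aesthetic name, not by 'values'
  guides(color = top_title, shape = top_title) +
  theme(legend.position = "bottom", legend.direction = "horizontal") +
  coord_flip()

p
```

With this variant the title comes from guide_legend(), so the name argument on the scales is not needed. Newer ggplot2 releases may report title.position as deprecated; in that case the theme system is probably the place to set the title position instead.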