Tags: r, plot, scaling

geom_point points manual scaling


I have some data (named result.df) that looks like this:

    orgaName                  abundance          pVal         score        
     A                        3          9.998622e-01     1.795338e-04
     B                        2          9.999790e-01     1.823428e-05
     C                        1          2.225074e-308    3.076527e+02
     D                        1          3.510957e-01     4.545745e-01

and so on...

What I am now plotting is this:

p1 <- ggplot(result.df, aes(log2(abundance), (1-pVal), label=orgaName)) +
   ylab("1 - P-Value")+
   xlab("log2(abundance)")+
   geom_point(aes(size=score))+
   ggtitle(colnames(case.count.matrix)[i])+
   geom_text(data=subset(result.df, pVal < 0.05),hjust=.65, vjust=-1.2,size=2.5)+       
   geom_hline(aes(yintercept=.95), colour="blue", linetype="dashed")+
   theme_classic()

This works and the plot looks fine. However, I would like the point size introduced through

geom_point(aes(size=score))+

to be scaled against fixed values. The legend should use base-10 logarithmic steps while the scores themselves stay unchanged, so that low scores nearly disappear and large scores have comparable point sizes across different "result.df" data frames.

EDIT

After following up on the comments from @roman and @vrajs5, I was able to produce a new plot using the following code:

   ggplot(result.df, aes(log2(abundance), (1-pVal), label=orgaName)) +
   ylab("1 - P-Value")+
   xlab("log2(abundance)")+
   geom_point(aes(size=score))+
   ggtitle(colnames(case.count.matrix)[i])+    
   #geom_text(data=subset(result.df, pVal < 0.05 & log2(abundance) > xInt),hjust=.65, vjust=-1.2,size=2.5)+
   geom_text(data=subset(result.df, pVal < 0.05),hjust=.65, vjust=-1.2,size=2.5)+
   geom_hline(aes(yintercept=.95), colour="blue", linetype="dashed")+
   #geom_vline(aes(xintercept=xInt), colour="blue", linetype="dashed")+
   #geom_text(data=subset(result.df, pVal > 0.05 & log2(abundance) > xInt),alpha=.5,hjust=.65, vjust=-1.2,size=2)+
   #geom_text(data=subset(result.df, pVal < 0.05 & log2(abundance) < xInt),alpha=.5,hjust=.65, vjust=-1.2,size=2)+
   theme_classic() + 
   scale_size(range=c(2,12),expand=c(2,0),breaks=c(0,1,10,100,1000,1000000),labels=c(">=0",">=1",">=10",">=100",">=1000",">=1000000"),guide="legend")

As you can see, the breaks are introduced and labeled as intended. However, the point sizes in the legend do not match the point sizes in the plot. Any idea how to fix this?


Solution

  • As @Roman mentioned, if you use scale_size you can specify limits on the point size: its range argument sets the smallest and largest point size that is drawn.

    The following example shows how to control the size of the points:

    result.df = read.table(text = 'orgaName                  abundance          pVal         score        
    A                        3          9.998622e-01     1.795338e-04
    B                        2          9.999790e-01     1.823428e-05
    C                        1          2.225074e-308    3.076527e+02
    D                        1          3.510957e-01     4.545745e-01
    E                        3          2.510957e-01     2.545745e+00
    F                        3          1.510957e-02     2.006527e+02
    G                        2          5.510957e-01     3.545745e-02', header = TRUE)
    
    library(ggplot2)
    ggplot(result.df, aes(log2(abundance), (1-pVal), label=orgaName)) +
      ylab("1 - P-Value")+
      xlab("log2(abundance)")+
      geom_point(aes(size=score))+
      #ggtitle(colnames(case.count.matrix)[i])+
      geom_text(data=subset(result.df, pVal < 0.05),hjust=.65, vjust=-1.2,size=2.5)+       
      geom_hline(aes(yintercept=.95), colour="blue", linetype="dashed")+
      theme_classic() + 
      scale_size(range = c(2,12))
    

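    The range argument alone rescales every data frame to its own minimum and maximum score, so point sizes are not comparable between plots. One possible way to get the fixed, logarithmic scaling asked for in the question is sketched below. It maps log10(score) to size and pins the scale with limits, so the same score should get the same point size in every plot. The bounds in size.limits and the breaks in size.breaks are assumptions chosen for this toy data, not values from the question, and any score outside the limits would be dropped with a warning.

```r
library(ggplot2)

# Same toy data as in the example above.
result.df <- read.table(text = "orgaName abundance pVal score
A 3 9.998622e-01 1.795338e-04
B 2 9.999790e-01 1.823428e-05
C 1 2.225074e-308 3.076527e+02
D 1 3.510957e-01 4.545745e-01
E 3 2.510957e-01 2.545745e+00
F 3 1.510957e-02 2.006527e+02
G 2 5.510957e-01 3.545745e-02", header = TRUE)

# Hypothetical fixed bounds in log10 units of score (assumed: 1e-5 to 1e6).
# Reuse the same two vectors for every result.df that should be comparable.
size.limits <- c(-5, 6)
size.breaks <- c(-4, -2, 0, 2, 4, 6)

p <- ggplot(result.df, aes(log2(abundance), 1 - pVal, label = orgaName)) +
  # Map the log-transformed score to size; the raw scores stay untouched.
  geom_point(aes(size = log10(score))) +
  geom_text(data = subset(result.df, pVal < 0.05),
            hjust = .65, vjust = -1.2, size = 2.5) +
  geom_hline(yintercept = .95, colour = "blue", linetype = "dashed") +
  ylab("1 - P-Value") +
  xlab("log2(abundance)") +
  theme_classic() +
  # limits fixes the data range, range fixes the drawn sizes,
  # and the labels show the breaks as powers of ten.
  scale_size(name = "score",
             range = c(0.5, 12),
             limits = size.limits,
             breaks = size.breaks,
             labels = paste0("1e", size.breaks))
print(p)
```

    Because the legend keys are drawn from the same scale as the points, their sizes should match the points in the plot. Widening size.limits makes all points in a given plot look more alike, so the bounds are worth choosing close to the scores you actually expect.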