
R: Smooth scatter plot with a specific color for each category


Hello, I have created a smooth scatter plot with the points colored by data category, but I cannot tell which color represents which category. The plot would be easier to read if I could assign a specific color to each category. How can I do this? I tried col=df$cat, but it does not give each point the color of the category it belongs to, and I do not understand why. Can anyone suggest a fix?

I want a specific color for each category, for example data1=green, data2=blue, data3=black, data4=red.

My data:

   A    B   cat
 0.8803 0.0342  data1
 0.9174 0.0331  data1
 0.9083 0.05    data1
 0.7542 0.161   data2
 0.8983 0.0593  data2
 0.8182 0.1074  data2
 0.3525 0.3525  data3
 0.5339 0.2288  data3
 0.7295 0.082   data3
 0.8    0.0417  data4
 0.8291 0.0598  data4
 0.8    0.025   data4

The script I have tried:

df=read.table("test.txt", sep='\t', header=TRUE)
smoothScatter(df$B,df$A,,nrpoints=Inf,xlim=c(0,1),ylim=c(0,1), pch=20,cex=1, col=df$cat)
points(c(0,1),c(1,0),type='l',col='green',lty=2,lwd=2)
legend(0.30, 1, c("diagonal"), 
       col = c("green"), 
       lty = 2, lwd = 4) 

Thank you


Solution

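  • First, define the colors as a named vector keyed by category. Then plot with x = A and y = B, mapping color = cat inside geom_point only, and pass the named vector to scale_color_manual so that each category gets its assigned color. The diagonal is drawn by a separate geom_line call with its own data argument, which holds the two end points of the line.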

    library(ggplot2)
    
    color <- c(data1="green", data2="blue", data3="black", data4="red")
    
    ggplot(df, aes(A, B)) +
      geom_point(aes(color = cat), alpha = 0.25) +
      geom_line(
        data = data.frame(A = 0:1, B = 1:0),
        mapping = aes(A, B),
        color = "green",
        linetype = "dashed"
      ) +
      scale_color_manual(values = color) +
      theme_bw()
    



    Base R

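    With smoothScatter, first use match to turn each category into its position in names(color), then index color by those positions to get one color per point.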

    colnum <- match(df$cat, names(color))
    
    smoothScatter(
      df$B, df$A,
      nrpoints = Inf,
      xlim = c(0, 1), ylim = c(0, 1), 
      pch = 20, cex = 1, 
      col = color[colnum]
    )
    points(c(0, 1), c(1, 0), type = 'l', col = 'green', lty = 2, lwd = 2)
    legend(0.30, 1, "diagonal", col = "green", lty = 2, lwd = 4) 
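
The lookup used above (match against names(color), then index by position) can be checked on its own, without plotting. This is a minimal sketch; cat_values is a hypothetical stand-in for df$cat, and the same lookup should also work when the column is a factor, since match compares factors by their labels.

```r
# Named vector from the answer: category label -> color
color <- c(data1 = "green", data2 = "blue", data3 = "black", data4 = "red")

# Hypothetical category labels, in the form they would take in df$cat
cat_values <- c("data1", "data3", "data4", "data2", "data1")

# match() gives, for each label, its position in names(color)
colnum <- match(cat_values, names(color))
print(colnum)

# Indexing by position yields one color per label, in the same order
point_colors <- unname(color[colnum])
print(point_colors)
```

A label that is missing from names(color) produces NA in colnum, so checking anyNA(colnum) before plotting is a cheap way to catch misspelled categories.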
    


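    Alternatively, drop nrpoints = Inf from the smoothScatter call. It then uses its default (nrpoints = 100) and superimposes only the points from the lowest-density regions.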

    smoothScatter(
      df$B, df$A,
      #nrpoints = Inf,
      xlim = c(0, 1), ylim = c(0, 1), 
      pch = 20, cex = 1, 
      col = color[colnum]
    )
    points(c(0, 1), c(1, 0), type = 'l', col = 'green', lty = 2, lwd = 2)
    legend(0.30, 1, "diagonal", col = "green", lty = 2, lwd = 4) 
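
The question also asks which color stands for which category, and neither base R plot above shows that. A possible variant, sketched here under some assumptions: as far as I can tell, smoothScatter picks and orders the points it draws by local density, so a per-row col vector may not stay aligned with the rows. Drawing only the density (nrpoints = 0) and overlaying the points with points() avoids relying on that behavior. The data frame is typed inline from the table in the question so the sketch runs on its own; the legend position and axis labels are my own choices.

```r
# Data copied from the question, typed inline so the sketch is self-contained
df <- read.table(header = TRUE, text = "
A      B      cat
0.8803 0.0342 data1
0.9174 0.0331 data1
0.9083 0.05   data1
0.7542 0.161  data2
0.8983 0.0593 data2
0.8182 0.1074 data2
0.3525 0.3525 data3
0.5339 0.2288 data3
0.7295 0.082  data3
0.8    0.0417 data4
0.8291 0.0598 data4
0.8    0.025  data4
")

# Category -> color mapping requested in the question
color <- c(data1 = "green", data2 = "blue", data3 = "black", data4 = "red")

# One color per row, in row order
point_colors <- color[match(df$cat, names(color))]

# Draw only the density shading; nrpoints = 0 suppresses the built-in points
smoothScatter(df$B, df$A, nrpoints = 0,
              xlim = c(0, 1), ylim = c(0, 1), xlab = "B", ylab = "A")

# Overlay every observation; rows and colors are in the same order here
points(df$B, df$A, pch = 20, cex = 1, col = point_colors)

# Dashed diagonal from (0, 1) to (1, 0)
lines(c(0, 1), c(1, 0), col = "green", lty = 2, lwd = 2)

# One legend for the categories (symbols) and the diagonal (line)
legend("topright",
       legend = c(names(color), "diagonal"),
       col = c(color, "green"),
       pch = c(rep(20, length(color)), NA),
       lty = c(rep(NA, length(color)), 2),
       lwd = 2)
```

Note that data1 and the diagonal are both green, as in the original request; picking a different color for the diagonal would make the legend easier to read.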
    


    Data

    google_id <- "1FN0H7ilY9OwcQC7Y8iiLUZd2CTuJ9U4Z"
    google_file <- sprintf("https://docs.google.com/uc?id=%s&export=download", google_id)
    df <- read.table(google_file, header = TRUE)