Tags: r, colors, legend, scatter-plot, bubble-chart

How to add a legend after updating a scatter plot with the symbols() command in R


I've created a bubble chart / scatter plot in R using the following data:

[image: View of my_data_set]

and following code:

my_data_set <- read.csv("c:/Users/Person/Desktop/my_data_set.csv")

View(my_data_set)

plot(my_data_set$Analysis_Vs_Presentation, my_data_set$Flexibility)

IScolors <- c("#e6f598", "#66c2a5")

TypeLevels <- as.numeric(my_data_set$Type)

symbols(my_data_set$Analysis_Vs_Presentation, my_data_set$Flexibility, circles=sqrt(my_data_set$Easiness), inches=0.8, bg = IScolors[TypeLevels], fg="black", xlab="Presentation", ylab="Flexibility", main="Comparison of 5 Data Analytics Tools", xlim=c(0, 11), ylim=c(0, 11))

text(my_data_set$Analysis_Vs_Presentation, my_data_set$Flexibility, my_data_set$Tool, cex=1)

which gives me a bubble chart scatter plot with differently sized bubbles depending on the value of Easiness, and a bubble colour depending on the value of Type.

[image: my bubble chart / scatter plot]

I want to add a legend to show what the colour of the bubble means. I tried using this:

legend("bottomright", legend=my_data_set$Type, col=IScolors, cex=0.75)

and that displayed a legend in the bottom right, but it just listed the 5 values of the Type attribute.

How do I ask it to display something that lists the 2 distinct values of the Type attribute, and the associated colour used in the chart?

UPDATE: Chris - after I tried your suggestion I see a legend but it shows all 5 values rather than just the 2 distinct values:

[image: screenshot of plot with added legend]


Solution

  • I recreated your data to see how it might work. The solution is quite simple if all you are after is one legend entry per type, each with its colour. This is effectively your code; the changed bit follows below:

    df <- data.frame(
      Tool = c("R", "GGPlot2", "Tableau", "D3", "Excel"),
      Flex = c(6,8,7,10,2),
      Type = c("static", "static", "interactive", "interactive", "static"),
      Easi = c(6,5,10,1,7),
      Ana_v_Pres = c(1,2,5,10,3)
    )
    View(df)
    
    plot(df$Ana_v_Pres, df$Flex)
    IScolors <- c("#e6f598", "#66c2a5")
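    # Convert to a factor first: since R 4.0, data.frame() keeps strings as
    # character, and as.numeric() on a character column returns NA
    TypeLevels <- as.numeric(factor(df$Type))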
    symbols(df$Ana_v_Pres, df$Flex, circles=sqrt(df$Easi), inches=0.8, 
        bg = IScolors[TypeLevels], fg="black", xlab="Presentation", 
        ylab="Flexibility", main="Comparison of 5 Data Analytics Tools", 
        xlim=c(0, 11), ylim=c(0, 11))
    
    text(df$Ana_v_Pres, df$Flex, df$Tool, cex=1)
    

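    Now for the change: pass legend() only the two distinct labels instead of the whole Type column, and use the fill argument to draw the colour boxes (col only affects points and lines, so it is not needed here). The labels must be in factor-level order, which is alphabetical ("interactive", then "static"), because that is the order in which IScolors is indexed:

    legend("bottomright", legend=levels(factor(df$Type)), fill=IScolors, cex=0.75)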
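
To keep the legend labels and the bubble colours from drifting apart, one option is to name each colour after its factor level and derive both from that single vector. This is only a minimal sketch, assuming the Type values from the recreated data above; the names type_f, type_colors, point_colors, legend_labels and legend_fill are illustrative and not part of the original code:

```r
# Stand-in for the Type column of the recreated data frame (assumption)
type <- c("static", "static", "interactive", "interactive", "static")

# Convert to a factor explicitly; levels are sorted alphabetically
type_f <- factor(type)

# Name each colour after the level it belongs to
type_colors <- setNames(c("#e6f598", "#66c2a5"), levels(type_f))

# One colour per data point, looked up by level name rather than by position
point_colors <- unname(type_colors[as.character(type_f)])

# One legend entry per distinct level, in the same order as the colours
legend_labels <- names(type_colors)
legend_fill <- unname(type_colors)

# After symbols(..., bg = point_colors, ...) has drawn the plot, the legend
# call would then be:
# legend("bottomright", legend = legend_labels, fill = legend_fill, cex = 0.75)
```

Because the labels and fills come from the same named vector, reordering the colours or adding a third type changes the bubbles and the legend together.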