I am creating a scatterplot using ggplot2 and have different shapes and colours to differentiate "Sites" in my data.
This is the code that I used to try and produce it (see Scatterplot 1 link for image):
all <- ggplot(data = all_quantity, aes(x=Distance, y=Quantity, shape=Site, colour=Site)) +
geom_point(size=3) + #size of points
scale_y_continuous(limits = c(0, 80), breaks = c(0, 10, 20, 30, 40, 50, 60, 70, 80)) + #y-axis interval
scale_x_continuous(limits = c(0, 30), breaks = c(0, 5, 10, 15, 20, 25, 30)) + #x-axis interval
labs(y="Quantity (pg/mL)", x="Downstream Distance (km)")
all
I would like the points for Belton, Bridgeport, and Canyon to have a black outline, since without it the plot looks like a mess of colours and it's hard to tell that there are multiple samples.
I have also tried the code below, but I would still like Lewisville, Stillhouse, and Worth to keep the colours from the previous plot instead of being black. The way Belton, Bridgeport, and Canyon look is what I'm aiming for, though (see Scatterplot 2 link for image):
all <- ggplot(data = all_quantity, aes(x=Distance, y=Quantity, shape=Site, fill=Site)) +
geom_point(size=3, colour="black") +
scale_shape_manual(values = c(21, 24, 22, 3, 7, 8)) +
scale_y_continuous(limits = c(0, 80), breaks = c(0, 10, 20, 30, 40, 50, 60, 70, 80)) + #y-axis interval
scale_x_continuous(limits = c(0, 30), breaks = c(0, 5, 10, 15, 20, 25, 30)) + #x-axis interval
labs(y="Quantity (pg/mL)", x="Downstream Distance (km)")
all
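Instead of using color="black", map Site on the color aesthetic as well. Then use scale_color_manual to set the colors. Shapes 21, 24 and 22 are filled shapes (outline drawn with color, interior drawn with fill), so they can get a black outline, while shapes 3, 7 and 8 have no fill and are drawn with color only, so they keep their default colors: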
Using some fake data:
library(ggplot2)
library(scales)
all_quantity <- data.frame(
Distance = 1:6,
Quantity = 1:6,
Site = c("Belton", "Bridgeport", "Canyon", "Lewisville", "Stillhouse", "Worth")
)
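pal <- scales::hue_pal()(6) # ggplot2's default discrete colours for six sites
pal[1:3] <- "black" # black outline for the first three sites (Belton, Bridgeport, Canyon)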
all <- ggplot(data = all_quantity, aes(x=Distance, y=Quantity, shape=Site, fill=Site, color = Site)) +
geom_point(size=3) +
scale_shape_manual(values = c(21, 24, 22, 3, 7, 8)) +
scale_color_manual(values = pal) +
scale_y_continuous(limits = c(0, 80), breaks = c(0, 10, 20, 30, 40, 50, 60, 70, 80)) + #y-axis interval
scale_x_continuous(limits = c(0, 30), breaks = c(0, 5, 10, 15, 20, 25, 30)) + #x-axis interval
labs(y="Quantity (pg/mL)", x="Downstream Distance (km)")
all
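Note that pal[1:3] relies on Belton, Bridgeport and Canyon being the first three levels of Site (they are here, because the levels sort alphabetically). If your real data has a different level order, a named vector may be safer, since scale_color_manual matches names to levels. A minimal sketch of that variant, assuming the same six site names as in the fake data (outline_cols and site_shapes are just illustrative names, not anything from your code):

```r
library(ggplot2)
library(scales)

# Fake data standing in for the real all_quantity
all_quantity <- data.frame(
  Distance = 1:6,
  Quantity = 1:6,
  Site = c("Belton", "Bridgeport", "Canyon", "Lewisville", "Stillhouse", "Worth")
)

# Site names in the order ggplot2 would use for a character column
sites <- sort(unique(all_quantity$Site))

# Default ggplot2 colours, named by site, then black for the filled shapes
outline_cols <- setNames(scales::hue_pal()(length(sites)), sites)
outline_cols[c("Belton", "Bridgeport", "Canyon")] <- "black"

# Shapes named by site, so each site keeps its shape regardless of order
site_shapes <- setNames(c(21, 24, 22, 3, 7, 8), sites)

p <- ggplot(all_quantity,
            aes(x = Distance, y = Quantity, shape = Site, fill = Site, colour = Site)) +
  geom_point(size = 3) +
  scale_shape_manual(values = site_shapes) +
  scale_colour_manual(values = outline_cols)
p
```

The fill scale is left at its default, so the three filled shapes keep the same interior colours as in your second plot.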