
How do I fix the color assigned to each value range in a legend?


I am trying to plot the yield (median_mafruit) from two datasets, Data1 and Data2, over the same area. I want each value range to map to a fixed color in the legend. My code is as follows:

library(ggplot2)
library(dplyr)
library(readr)

color_breaks <- c(0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0)
colors <- c("#E31A1C", "#FF7F00","#FDBF6F","#E9E4A6", "#A4D4A9", "#B2DF8A", "#33A02C", "#1F78B4")

p1<-ggplot(merged_data, aes(x = longitude, y = latitude, z = median_mafruit)) +
  stat_summary_2d(bins = 30) + 
  scale_fill_gradientn(
    colors = colors,
    values = scales::rescale(color_breaks),
    breaks = color_breaks,
    labels = scales::number_format(accuracy = 0.1)
  ) +
  labs(
    title = "Data2",
    x = "Longitude",
    y = "Latitude",
    fill = expression("Amount")
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold"),
    plot.subtitle = element_text(size = 14, face = "italic"),
    legend.title = element_text(size = 12),
    legend.text = element_text(size = 10)
  )
p1

But the colors don't match my intended mapping: even though no value reaches 3.5, the plot still uses blue (see the figure below).

[Figure: plot produced by the code above]

Regardless of which dataset is plotted, I want the colors to follow the definition below:

[0, 0.5) → "#E31A1C"
[0.5, 1) → "#FF7F00"
[1, 1.5) → "#FDBF6F"
[1.5, 2) → "#E9E4A6"
[2, 2.5) → "#A4D4A9"
[2.5, 3) → "#B2DF8A"
[3, 3.5) → "#33A02C"
[3.5, 4) → "#1F78B4"

Even if median_mafruit does not reach 3 in the data, I still want to keep the two ranges [3, 3.5) and [3.5, 4) in the legend, with their colors "#33A02C" and "#1F78B4".

The ideal Legend is similar to the following figure:

[Figure: desired legend]

Even if the underlying data differ, I want the legends for Data1 and Data2 to be identical.

Part of Data2 is shown below:

merged_data <- structure(list(
  Order = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
            13, 14, 15, 16, 17, 18, 19, 20),
  latitude = c(43.2143, 43.3697, 43.3909, 43.3926, 43.3961, 43.3978,
               43.066, 43.2215, 43.368, 43.435, 43.4434, 43.4623,
               43.4895, 43.4856, 43.3738, 43.4761, 43.5102, 43.5118,
               43.5062, 43.4933),
  longitude = c(-4.479, -4.4804, -4.4606, -4.4484, -4.4241, -4.412,
                -4.1046, -4.1049, -4.1031, -4.0818, -4.021, -3.9502,
                -3.8184, -3.7798, -3.7279, -3.7147, -3.5971, -3.5849,
                -3.5585, -3.5177),
  median_mafruit = c(2.73, 1.095, 1.115, 2.73, 0.527, 0.527, 0.962,
                     1.039, 1.039, 2.73, 2.73, 2.73, 2.73, 2.73, 0.544,
                     2.73, 2.73, 2.73, 0.478, 2.73)
), row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"))

Any suggestions are welcome! Thanks in advance!


Solution

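  • The issue is that the two datasets have different ranges. By default the limits of the fill scale are taken from the data, and the positions passed to values= are relative to those limits, so the gradient is stretched over whatever range the data happen to cover rather than over 0 to 4. To get the same legend for both plots, and the correct assignment of colors to your breaks, fix the limits of the scale with the limits argument: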

    library(ggplot2)
    
    color_breaks <- c(0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0)
    colors <- c("#E31A1C", "#FF7F00", "#FDBF6F", "#E9E4A6", "#A4D4A9", "#B2DF8A", "#33A02C", "#1F78B4")
    
    p1 <- ggplot(merged_data, aes(x = longitude, y = latitude, z = median_mafruit)) +
      labs(
        title = "Data2",
        x = "Longitude",
        y = "Latitude",
        fill = expression("Amount")
      ) +
      theme_minimal() +
      theme(
        plot.title = element_text(size = 16, face = "bold"),
        plot.subtitle = element_text(size = 14, face = "italic"),
        legend.title = element_text(size = 12),
        legend.text = element_text(size = 10)
      )
    
    p1 +
      stat_summary_2d(bins = 30) +
      scale_fill_gradientn(
        colors = colors,
        values = scales::rescale(color_breaks),
        breaks = color_breaks,
        labels = scales::number_format(accuracy = 0.1),
        limits = range(color_breaks)
      )
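
    To see why the limits matter, here is a minimal sketch of the arithmetic. It assumes, for illustration only, that the summarised values span the same range as a few raw values from the sample; the names z, pos_default and pos_fixed are made up for this example. The same value lands at a different position on the gradient depending on the limits:

    ```r
    library(scales)

    color_breaks <- c(0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0)

    # A few values taken from the sample of Data2 (illustrative subset)
    z <- c(2.73, 1.095, 0.527, 0.478)

    # Default behaviour: limits are the range of the data,
    # so the largest value is placed at the very end of the gradient
    pos_default <- rescale(max(z), from = range(z))            # 1

    # Fixed limits: the same value is placed relative to 0..4
    pos_fixed <- rescale(max(z), from = range(color_breaks))   # 0.6825

    print(c(default = pos_default, fixed = pos_fixed))
    ```

    With the default limits the largest value sits at position 1, the blue end of the gradient, however small it is. With the limits fixed to 0 and 4, the value 2.73 sits at about 0.68 and the blue end stays reserved for values near 4.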
    

    A second option, which more closely resembles the image of what you are trying to achieve, is to bin the values manually with cut() (or automatically using scale_fill_binned):

    p1 +
      stat_summary_2d(
        aes(fill = after_stat(
          cut(
            value,
            breaks = c(0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0)
          )
        )),
        bins = 30,
        show.legend = TRUE
      ) +
      scale_fill_manual(
        values = colors,
        drop = FALSE,
        guide = guide_legend(reverse = TRUE)
      )
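
    One detail worth checking in the manual binning: cut() builds right-closed bins such as (0,0.5] by default, whereas the question asks for left-closed ranges such as [0, 0.5). A possible variant, sketched here on a small hypothetical data frame built from a few rows of the sample, passes right = FALSE and names the colors by bin label so that each range keeps its color whatever data are plotted. The names demo, bin_levels, named_colors and p_binned are assumptions for illustration, not part of the original answer:

    ```r
    library(ggplot2)

    color_breaks <- c(0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0)
    colors <- c("#E31A1C", "#FF7F00", "#FDBF6F", "#E9E4A6",
                "#A4D4A9", "#B2DF8A", "#33A02C", "#1F78B4")

    # Hypothetical demo data: four rows copied from the sample of Data2
    demo <- data.frame(
      longitude      = c(-4.479, -4.4804, -4.4241, -3.5585),
      latitude       = c(43.2143, 43.3697, 43.3961, 43.5062),
      median_mafruit = c(2.73, 1.095, 0.527, 0.478)
    )

    # right = FALSE makes the bins left-closed: [0,0.5), [0.5,1), ...
    bin_levels <- levels(cut(0, breaks = color_breaks, right = FALSE))

    # Name each color by its bin so the mapping does not depend on the data
    named_colors <- setNames(colors, bin_levels)

    p_binned <- ggplot(demo, aes(x = longitude, y = latitude, z = median_mafruit)) +
      stat_summary_2d(
        aes(fill = after_stat(cut(value, breaks = color_breaks, right = FALSE))),
        bins = 30,
        show.legend = TRUE
      ) +
      scale_fill_manual(
        values = named_colors,
        drop = FALSE,                       # keep bins that contain no data
        guide = guide_legend(reverse = TRUE)
      ) +
      labs(fill = "Amount")

    p_binned
    ```

    Because the colors are matched by name and drop = FALSE keeps the empty bins, the same scale_fill_manual() call can be added to the plots for both Data1 and Data2. Note that with right = FALSE a value of exactly 4 falls outside the last bin and becomes NA.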