r, ggplot2, geom, geom-tile

How to switch from geom_point to geom_bin2d() or geom_tile?


I want to plot the yield (mafruit) of different grid cells in a region. At first I used geom_point().

Now I want to switch to geom_tile() or geom_bin2d() so that the tiles fill the entire area. However, geom_tile() does not render the image, and geom_bin2d() does not color the bins by median_mafruit:

library(ggplot2)
library(dplyr)
library(viridis)
library(grid)

p1<-
  ggplot(merged_data, aes(x = longitude, y = latitude, color = median_mafruit)) +
  geom_bin2d() + 
  scale_color_viridis_c(
    limits = c(0, 4),
    breaks = seq(0, 4, by = 0.5),
    labels = scales::number_format(accuracy = 0.1)
  ) +
  labs(
    title = "MG0000-Tiguan",
    subtitle = "Historical climate (1980-2010)",
    x = "Longitude",
    y = "Latitude",
    color = expression("Soybean yield \n(t ha"^-1*")")
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold"),
    plot.subtitle = element_text(size = 14, face = "italic"),
    legend.title = element_text(size = 12),
    legend.text = element_text(size = 10)
  )
p1

Part of my data is below:

merged_data <- structure(list(Order = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
13, 14, 15, 16, 17, 18, 19, 20), latitude = c(43.2143, 43.3697, 
43.3909, 43.3926, 43.3961, 43.3978, 43.066, 43.2215, 43.368, 
43.435, 43.4434, 43.4623, 43.4895, 43.4856, 43.3738, 43.4761, 
43.5102, 43.5118, 43.5062, 43.4933), longitude = c(-4.479, -4.4804, 
-4.4606, -4.4484, -4.4241, -4.412, -4.1046, -4.1049, -4.1031, 
-4.0818, -4.021, -3.9502, -3.8184, -3.7798, -3.7279, -3.7147, 
-3.5971, -3.5849, -3.5585, -3.5177), median_mafruit = c(2.73, 
1.095, 1.115, 2.73, 0.527, 0.527, 0.962, 1.039, 1.039, 2.73, 
2.73, 2.73, 2.73, 2.73, 0.544, 2.73, 2.73, 2.73, 0.478, 2.73)), row.names = c(NA, 
-20L), class = c("tbl_df", "tbl", "data.frame"))

Any suggestions are welcome! Thanks in advance!


Solution

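  • The color aesthetic only affects the tile outlines, and geom_bin2d() fills each bin by the number of points in it, not by median_mafruit. One option is stat_summary_2d(), which maps a per-bin summary of the z aesthetic (here median_mafruit) to the fill aesthetic. The default summary is the mean; you can set another function via the fun argument: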

    library(ggplot2)
    
    ggplot(merged_data, aes(x = longitude, y = latitude, z = median_mafruit)) +
      stat_summary_2d() +
      scale_fill_viridis_c(
        limits = c(0, 4),
        breaks = seq(0, 4, by = 0.5),
        labels = scales::number_format(accuracy = 0.1)
      ) +
      labs(
        title = "MG0000-Tiguan",
        subtitle = "Historical climate (1980-2010)",
        x = "Longitude",
        y = "Latitude",
        fill = expression("Soybean yield \n(t ha"^-1 * ")")
      ) +
      theme_minimal() +
      theme(
        plot.title = element_text(size = 16, face = "bold"),
        plot.subtitle = element_text(size = 14, face = "italic"),
        legend.title = element_text(size = 12),
        legend.text = element_text(size = 10)
      )
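
If the per-bin median is preferred, or if geom_tile() itself is wanted, the same idea can be sketched as below. By default geom_tile() derives the tile size from the smallest gap between coordinates, which for irregularly spaced points is likely what leaves the tiles too small to see. The sketch therefore snaps the points to a regular grid first. The cell size of 0.1 degrees and the names cell, tile_data, lon_bin, lat_bin and yield are assumptions chosen for illustration, not part of the original data or answer.

```r
library(ggplot2)
library(dplyr)

# Sample data from the question (coordinates and yield only).
merged_data <- data.frame(
  latitude = c(43.2143, 43.3697, 43.3909, 43.3926, 43.3961, 43.3978, 43.066,
               43.2215, 43.368, 43.435, 43.4434, 43.4623, 43.4895, 43.4856,
               43.3738, 43.4761, 43.5102, 43.5118, 43.5062, 43.4933),
  longitude = c(-4.479, -4.4804, -4.4606, -4.4484, -4.4241, -4.412, -4.1046,
                -4.1049, -4.1031, -4.0818, -4.021, -3.9502, -3.8184, -3.7798,
                -3.7279, -3.7147, -3.5971, -3.5849, -3.5585, -3.5177),
  median_mafruit = c(2.73, 1.095, 1.115, 2.73, 0.527, 0.527, 0.962, 1.039,
                     1.039, 2.73, 2.73, 2.73, 2.73, 2.73, 0.544, 2.73, 2.73,
                     2.73, 0.478, 2.73)
)

# Assumed cell size in degrees; adjust to the real grid spacing.
cell <- 0.1

# Variant 1: let ggplot2 bin the points and take the median per bin.
p_summary <- ggplot(merged_data,
                    aes(x = longitude, y = latitude, z = median_mafruit)) +
  stat_summary_2d(fun = median, binwidth = c(cell, cell)) +
  scale_fill_viridis_c(limits = c(0, 4))

# Variant 2: snap each point to the centre of its cell, take the median
# per cell, and draw the cells with geom_tile() at an explicit size.
tile_data <- merged_data %>%
  mutate(
    lon_bin = (floor(longitude / cell) + 0.5) * cell,
    lat_bin = (floor(latitude / cell) + 0.5) * cell
  ) %>%
  group_by(lon_bin, lat_bin) %>%
  summarise(yield = median(median_mafruit), .groups = "drop")

p_tile <- ggplot(tile_data, aes(x = lon_bin, y = lat_bin, fill = yield)) +
  geom_tile(width = cell, height = cell) +
  scale_fill_viridis_c(limits = c(0, 4))
```

Printing p_summary or p_tile draws the plot; the labs() and theme() calls from the answer above can be added to either one unchanged. Cells that contain no points stay empty in both variants, so filling the whole region would need either a larger cell size or interpolation.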