Search code examples
rggplot2shapefiler-sfbinning

How can I bin data into hexagons of a shapefile and plot it?


I am new to r and also to this website. I ran into some trouble with my current distribution project. My goal is to create a map with hexagons that have a colour gradient based on different attributes. For example number of records, number of species, rarefaction, etc. in the hexagon. I started with two shapefiles.

One for the hexagons:

Simple feature collection with 10242 features and 4 fields

geometry type: MULTIPOLYGON

dimension: XY

bbox: xmin: -180 ymin: -90 xmax: 180 ymax: 90

CRS: 4326

First 10 features:

 ID CENTRELAT  CENTRELON     AREA                       geometry

 1 -43.06618   41.95708 41583.14 MULTIPOLYGON (((43.50039 -4...

 2 -73.41802 -144.73583 41836.20 MULTIPOLYGON (((-147.695 -7...

 4862 -82.71189  -73.45815 50247.96 MULTIPOLYGON (((-78.89901 -...

 7162  88.01938   53.07438 50258.17 MULTIPOLYGON (((36.63494 87...

 3 -75.32015 -145.44626 50215.61 MULTIPOLYGON (((-148.815 -7...

 4 -77.21239 -146.36437 50225.85 MULTIPOLYGON (((-150.2982 -...

 5 -79.11698 -147.60550 50234.84 MULTIPOLYGON (((-152.3518 -...

 6 -81.03039 -149.37750 50242.49 MULTIPOLYGON (((-155.3729 -...

 7 -82.94618 -152.11105 50248.70 MULTIPOLYGON (((-160.2168 -...

 8 -84.84996 -156.85274 50253.03 MULTIPOLYGON (((-169.0374 -...

And one for the map: geometry type: POLYGON; dimension: XY; bbox: xmin: -180 ymin: -90 xmax: 180 ymax: 83.64513; CRS: 4326

It is the land shapefile from this link: natural earth data

I loaded them with the st_read function. And created a map with this code:

 ggplot() +
  geom_sf(data = hex5) +
  geom_sf(data = land) +
  coord_sf(1, xlim = c(100, 180), ylim = c(0, 90))

The map

I have a data frame that contains species names, longitude and latitude. Roughly 6300 entries.

scientific              lat         lon
1   Acoetes melanonota  11.75690    124.8010
2   Acoetes melanonota  11.97500    102.7350
3   Acoetes melanonota  13.33000    100.9200
4   Acrocirrus muroranensis 42.31400    140.9670
5   Acrocirrus uchidai  43.04800    144.8560
6   Acrocirrus validus  35.30000    139.4830
7   Acutomunna minuta   29.84047    130.9178
8   Admetella longipedata   13.35830    120.5090
9   Admetella longipedata   13.60310    120.7570
10  Aega acuticauda 11.95750    124.1780

How can I bin this data into the hexagons of the map and colour them with a gradient?

Thank you very much!


Solution

  • As I understand it, you have some points and some polygons. You want to summarise the values of the points by the polygon they are in. I made a reproducible example of a possible solution:

    library(sf)
    library(data.table)
    library(dplyr)
    
    # Create an exagonal grid
    sfc = sf::st_sfc(sf::st_polygon(list(rbind(c(0,0), c(1,0), c(1,1), c(0,0)))))
    G = sf::st_make_grid(sfc, cellsize = .1, square = FALSE)
    # Convert to sf object
    G = sf::st_as_sf(data.table(id_hex=1:76, geom=sf::st_as_text(G)), wkt='geom')
    # Create random points on the grid with random value
    n=500
    p = data.table(id_point=1:n, 
                   value = rnorm(n),
                   x=sample(seq(0,1,0.01), n, replace=T),
                   y=sample(seq(0,1,0.01), n, replace=T)
                   )
    p = p[x >= y]
    P = sf::st_as_sf(p, coords=c('x', 'y'))
    # Plot geometry
    plot(sf::st_geometry(G))
    plot(P, add=TRUE)
    
    

    # Join the geometries to associate each polygon to the points it contains
    # Group by and summarise
    J = sf::st_join(G, P, join=sf::st_contains) %>% 
      dplyr::group_by(id_hex) %>% 
      dplyr::summarise(sum_value=sum(value, na.rm=F), 
                       count_value=length(value),
                       mean_value=mean(value, na.rm=F))
    
    plot(J)
    

    # Plot interactive map with mapview package
    mapview::mapview(J, zcol="count_value") + 
      mapview::mapview(P)
    

    enter image description here

    Created on 2020-04-25 by the reprex package (v0.3.0)