Search code examples
rhexagonal-tiles

Find coordinates for overlapping hexagonal bins between hexbin objects


I have two spatial datasets with coordinates indicating observations of a species and want to estimate the area of overlap among these datasets. Since point coordinates cannot represent an area, one has to bin the coordinates using similar x (longitude) and y (latitude) categories for both datasets.

For this task, I found the practical hexbin package, which does hexagonal binning. The package is great, but at least I fail to find a function that directly outputs the coordinates / IDs of overlapping bins among hexbin objects. For example, the hdiffplot returns a nice graphical overview of overlapping bins, but how to extract this information for further analysis?

library(hexbin)

set.seed(1); df1 <- data.frame(x = rnorm(10, 0, 5), y = rnorm(10, 0, 5))
set.seed(2); df2 <- data.frame(x = rnorm(10, 0, 5), y = rnorm(10, 0, 5))

xrange <- c(floor(min(c(df1$x, df2$x))-1), ceiling(max(c(df1$x, df2$x))+1))
#-/+1 just to make the plot nicer
yrange <- c(floor(min(c(df1$y, df2$y))-1), ceiling(max(c(df1$y, df2$y)))+1)

hb1 <- hexbin(df1$x, df1$y, xbins = 10, xbnds = xrange, ybnds = yrange)
hb2 <- hexbin(df2$x, df2$y, xbins = 10, xbnds = xrange, ybnds = yrange)

hdiffplot(hb1,hb2, xbnds = xrange, ybnds = yrange)

enter image description here


Solution

  • I figured out a solution to this problem while making the question. Will post it here in hopes that it will help someone one day.

    You can extract the coordinates using the hcell2xy function. Here is a little function to find the unique and overlapping coordinates for bin centroids:

    #' @title Print overlapping and unique bin centroid coordinates for two hexbin objects
    #' @param bin1,bin2 two objects of class hexbin.
    #' @details The hexbin objects for comparison, bin1 and bin2, must have the same plotting limits and cell size.
    #' @return Returns a list of data frames with unique coordinates for \code{bin1} and \code{bin2} as well as overlapping coordinates among bins.
    
    hdiffcoords <- function(bin1, bin2) {
    
      ## Checks modified from: https://github.com/edzer/hexbin/blob/master/R/hdiffplot.R
      if(is.null(bin1) | is.null(bin1)) {
        stop("Need 2 hex bin objects")
      } else {
            if(bin1@shape != bin2@shape)
                stop("Bin objects must have same shape parameter")
            if(all(bin1@xbnds == bin2@xbnds) & all(bin1@ybnds == bin2@ybnds))
                equal.bounds <- TRUE
            else stop("Bin objects need the same xbnds and ybnds")
            if(bin1@xbins != bin2@xbins)
                stop("Bin objects need the same number of bins")
      }
    
      ## Find overlapping and unique bins
    
      hd1 <- data.frame(hcell2xy(bin1), count_bin1 = bin1@count, cell_bin1 = bin1@cell)
      hd2 <- data.frame(hcell2xy(bin2), count_bin2 = bin2@count, cell_bin2 = bin2@cell)
    
      overlapping_hd1 <- apply(hd1, 1, function(r, A){ sum(A$x==r[1] & A$y==r[2]) }, hd2)
      overlapping_hd2 <- apply(hd2, 1, function(r, A){ sum(A$x==r[1] & A$y==r[2]) }, hd1)
    
      overlaps <- merge(hd1[as.logical(overlapping_hd1),], hd2[as.logical(overlapping_hd2),])
    
      unique_hd1 <- hd1[!as.logical(overlapping_hd1),]
      unique_hd2 <- hd2[!as.logical(overlapping_hd2),]
    
      ## Return list of data.frames
    
      list(unique_bin1 = unique_hd1, unique_bin2 = unique_hd2, overlapping = overlaps)
    
    }
    

    This information should be the same than returned by hdiffplot in graphical format:

    df <- hdiffcoords(hb1, hb2)
    
    library(ggplot2)
    
    ggplot() + 
      geom_point(data = df$unique_bin1, aes(x = x, y = y), color = "red", size = 10) + 
      geom_point(data = df$unique_bin2, aes(x = x, y = y), color = "cyan", size = 10) +
      geom_point(data = df$overlapping, aes(x = x, y = y), color = "green", size = 10) + theme_bw() 
    

    enter image description here

    Any comments/corrections are appreciated.