Search code examples
rheatmaptensorgammgcv

Using gratia::smooth_estimates() with a tensor to produce a heatmap for a defined area (e.g., US state lines) in R


I have a data set that looks like this:

  structure(list(count = c(85970L, 57321L, 46409L, 30436L, 47316L, 
2037L, 82188L, 127510L, 97109L, 15328L, 37766L, 74320L, 197130L, 
73258L, 27795L, 52434L, 124762L, 16471L, 22938L, 23192L, 269932L, 
69908L, 22480L, 180249L, 1771L, 66598L, 39182L, 46038L, 61145L, 
152175L, 1399L, 134424L, 43606L, 2221L, 43559L, 11684L, 12566L, 
180520L, 101116L, 69597L, 13915L, 7268L, 3427L, 96646L, 46133L, 
12031L, 13757L, 59464L, 32686L, 130545L, 322294L, 79400L, 148609L, 
100379L, 137641L, 461L, 162990L, 141629L, 8291L, 47033L, 167593L, 
232321L, 59686L, 64641L, 273107L, 138242L, 3891L, 139969L, 105676L, 
219271L, 24154L, 236896L, 44177L, 78308L, 72062L, 1708L, 106882L, 
53302L, 104267L, 123396L, 9971L, 82740L, 98013L, 17759L, 2338L, 
210581L, 77454L, 89777L, 304L, 303330L, 11536L, 13508L, 65698L, 
65678L, 168244L, 137721L, 128938L, 45270L, 12065L, 4788L, 29950L, 
84208L, 2938L, 2226L, 288115L, 34381L, 29622L, 32731L, 81229L, 
29917L, 52117L, 51471L, 55578L, 72827L, 4353L, 117578L, 41764L, 
108096L, 4978L, 24245L, 64693L, 95385L, 39591L, 18504L, 65459L, 
81650L, 248L, 55279L, 30344L, 84002L, 44458L, 48848L, 10242L, 
143527L, 5219L, 5684L, 124091L, 14563L, 22661L, 24415L, 78820L, 
5137L, 30521L, 30628L, 85572L, 3111L, 3772L, 88998L, 15453L, 
4040L, 38177L, 5006L, 1378L, 28548L, 97929L, 75099L, 58L, 32857L, 
3911L, 7545L, 4066L, 57735L, 11788L, 8526L, 66760L, 29170L, 92913L, 
53435L, 5395L, 35459L, 21655L, 3467L, 27090L, 16361L, 57477L, 
38412L, 126957L, 20936L, 14252L, 74194L, 51816L, 11702L, 14601L, 
21749L, 975L, 11135L, 1241L, 7758L, 39140L, 12575L, 37743L, 19391L, 
30211L, 38409L, 12963L, 102396L, 18964L, 110808L, 49L, 9085L, 
6728L, 66610L, 16165L, 3075L, 17844L, 49640L, 47995L, 16246L, 
59423L, 17876L, 75570L, 1384L, 10572L, 41492L, 51660L), Longitude = c(-95.53558, 
-94.08706, -93.5671, -98.24354, -95.53558, -96.09736, -94.08706, 
-93.5671, -101.05749, -96.06779, -98.24354, -95.53558, -94.08706, 
-93.5671, -98.24354, -95.53558, -94.08706, -93.5671, -98.24354, 
-95.53558, -94.08706, -93.5671, -98.24354, -95.53558, -96.09736, 
-94.08706, -93.5671, -95.53558, -95.53558, -94.08706, -98.24354, 
-93.5671, -95.53558, -96.09736, -94.08706, -96.06779, -98.24354, 
-95.53558, -93.5671, -95.53558, -95.53558, -96.96682, -96.09736, 
-94.08706, -96.57169, -93.5671, -95.53558, -96.57169, -99.1598, 
-95.53558, -94.08706, -93.5671, -101.05749, -98.24354, -95.53558, 
-96.09736, -94.08706, -93.5671, -96.06779, -98.24354, -95.53558, 
-94.08706, -93.5671, -98.24354, -95.53558, -94.08706, -95.90206, 
-93.5671, -98.24354, -95.53558, -97.0545, -93.5671, -98.24354, 
-99.1598, -95.53558, -96.09736, -94.08706, -93.5671, -101.05749, 
-95.53558, -96.96682, -93.5671, -95.53558, -98.24354, -95.90206, 
-93.5671, -101.05749, -95.53558, -96.09736, -94.08706, -96.06779, 
-98.24354, -99.1598, -95.53558, -93.5671, -95.53558, -101.05749, 
-98.24354, -97.0545, -95.90206, -99.1598, -95.53558, -96.96682, 
-96.09736, -94.08706, -96.57169, -98.24354, -96.57169, -93.5671, 
-95.53558, -95.53558, -94.08706, -93.5671, -95.53558, -96.09736, 
-94.08706, -93.5671, -101.05749, -96.06779, -98.24354, -95.53558, 
-94.08706, -93.5671, -98.24354, -95.53558, -94.08706, -95.90206, 
-93.5671, -98.24354, -95.53558, -94.08706, -93.5671, -98.24354, 
-95.53558, -97.0545, -96.09736, -94.08706, -93.5671, -96.06779, 
-98.24354, -95.53558, -96.96682, -93.5671, -95.53558, -94.08706, 
-98.24354, -95.90206, -93.5671, -95.53558, -96.09736, -94.08706, 
-96.06779, -98.24354, -95.53558, -93.5671, -95.53558, -95.90206, 
-95.53558, -96.96682, -97.0545, -96.09736, -94.08706, -96.06779, 
-99.1598, -96.57169, -93.5671, -95.53558, -96.57169, -95.53558, 
-94.08706, -93.5671, -98.24354, -99.1598, -95.53558, -94.08706, 
-93.5671, -101.05749, -98.24354, -95.53558, -94.08706, -93.5671, 
-98.24354, -95.53558, -94.08706, -95.90206, -93.5671, -98.24354, 
-95.53558, -93.5671, -98.24354, -95.53558, -94.08706, -93.5671, 
-99.1598, -95.53558, -101.05749, -95.53558, -94.08706, -95.90206, 
-93.5671, -95.53558, -94.08706, -101.05749, -98.24354, -95.53558, 
-93.5671, -99.1598, -95.53558, -101.05749, -98.24354, -94.08706, 
-96.57169, -96.57169, -93.5671, -98.24354), Latitude = c(32.80818, 
31.06582, 31.17619, 28.4825, 32.80818, 31.95728, 31.06582, 31.17619, 
29.45031, 32.18075, 28.4825, 32.80818, 31.06582, 31.17619, 28.4825, 
32.80818, 31.06582, 31.17619, 28.4825, 32.80818, 31.06582, 31.17619, 
28.4825, 32.80818, 31.95728, 31.06582, 31.17619, 32.80818, 32.80818, 
31.06582, 28.4825, 31.17619, 32.80818, 31.95728, 31.06582, 32.18075, 
28.4825, 32.80818, 31.17619, 32.80818, 32.80818, 33.06952, 31.95728, 
31.06582, 33.81821, 31.17619, 32.80818, 33.81821, 26.55788, 32.80818, 
31.06582, 31.17619, 29.45031, 28.4825, 32.80818, 31.95728, 31.06582, 
31.17619, 32.18075, 28.4825, 32.80818, 31.06582, 31.17619, 28.4825, 
32.80818, 31.06582, 32.82516, 31.17619, 28.4825, 32.80818, 33.35353, 
31.17619, 28.4825, 26.55788, 32.80818, 31.95728, 31.06582, 31.17619, 
29.45031, 32.80818, 33.06952, 31.17619, 32.80818, 28.4825, 32.82516, 
31.17619, 29.45031, 32.80818, 31.95728, 31.06582, 32.18075, 28.4825, 
26.55788, 32.80818, 31.17619, 32.80818, 29.45031, 28.4825, 33.35353, 
32.82516, 26.55788, 32.80818, 33.06952, 31.95728, 31.06582, 33.81821, 
28.4825, 33.81821, 31.17619, 32.80818, 32.80818, 31.06582, 31.17619, 
32.80818, 31.95728, 31.06582, 31.17619, 29.45031, 32.18075, 28.4825, 
32.80818, 31.06582, 31.17619, 28.4825, 32.80818, 31.06582, 32.82516, 
31.17619, 28.4825, 32.80818, 31.06582, 31.17619, 28.4825, 32.80818, 
33.35353, 31.95728, 31.06582, 31.17619, 32.18075, 28.4825, 32.80818, 
33.06952, 31.17619, 32.80818, 31.06582, 28.4825, 32.82516, 31.17619, 
32.80818, 31.95728, 31.06582, 32.18075, 28.4825, 32.80818, 31.17619, 
32.80818, 32.82516, 32.80818, 33.06952, 33.35353, 31.95728, 31.06582, 
32.18075, 26.55788, 33.81821, 31.17619, 32.80818, 33.81821, 32.80818, 
31.06582, 31.17619, 28.4825, 26.55788, 32.80818, 31.06582, 31.17619, 
29.45031, 28.4825, 32.80818, 31.06582, 31.17619, 28.4825, 32.80818, 
31.06582, 32.82516, 31.17619, 28.4825, 32.80818, 31.17619, 28.4825, 
32.80818, 31.06582, 31.17619, 26.55788, 32.80818, 29.45031, 32.80818, 
31.06582, 32.82516, 31.17619, 32.80818, 31.06582, 29.45031, 28.4825, 
32.80818, 31.17619, 26.55788, 32.80818, 29.45031, 28.4825, 31.06582, 
33.81821, 33.81821, 31.17619, 28.4825), offset = c(134328, 50282, 
46878, 57535.7, 100672, 4851, 70913, 55199, 121690, 15688.6, 
59288, 124281, 148777, 57957, 34830.9, 100834, 102516, 16389, 
34754.9, 56566, 197031, 51976, 26140, 130615, 1265, 58063, 31598, 
86864, 83759.9, 80516, 3515, 80735, 118817.9, 8541, 36604, 14642, 
10385, 311241.8, 65447, 92796, 36618, 18682.6, 9520, 74343, 62342, 
15057, 60602.9, 77226, 19514, 334731, 441498, 107881, 135099, 
109226.6, 382337, 5120, 235195, 114494, 24031, 75859.8, 478836, 
219171, 76915, 72063, 666114, 134346, 6949, 188637, 108274.8, 
261037, 58912, 230915, 50778.6, 47749, 135966, 5177, 63696, 80761, 
128092, 224357, 22207, 70718, 297008.8, 22282, 1635, 243728, 
140824.7, 272053, 2170, 167493, 16622.5, 14844, 49397, 184488.7, 
141263, 237449.6, 127915.1, 33783.9, 43091, 25200, 85572, 255177, 
17917.5, 7420, 198700, 52093, 38470, 35968, 128730, 57533, 121202, 
91912, 74601, 137409, 17411, 113492, 48226, 136830, 4818.6, 46005, 
143763, 114234, 43175, 21516, 176916, 70388, 2252, 50026, 45289.7, 
155560, 56997, 46969, 35316.6, 146456, 20876, 10526, 95602, 28169, 
35519, 23704, 99898.9, 7275.6, 24693, 69608.5, 66335, 8407, 12572, 
83488, 67187, 8595, 46444, 5193, 10060, 135945, 72540, 111847.9, 
583, 88803, 8409.9, 20392, 11294, 56327, 15113, 22435.6, 71785, 
30803, 244507, 95419, 29974, 48112, 29264, 11955.7, 22957.7, 
68170, 117780, 61955, 208126, 46523.8, 39590, 87287, 32962, 37750, 
39461, 25953, 8128, 19884, 14262.5, 48487, 32918, 20959, 65074, 
48968, 46839, 51903.9, 29462, 153058, 46252.8, 110974, 2439, 
10094, 42050.6, 78828, 57733.9, 7501, 137263.9, 67081, 61532, 
29538.9, 78188.5, 9508.5, 85583, 11530, 42289, 73050, 53812.6
), name = structure(c(5L, 9L, 12L, 3L, 5L, 8L, 9L, 12L, 1L, 2L, 
3L, 5L, 9L, 12L, 3L, 5L, 9L, 12L, 3L, 5L, 9L, 12L, 3L, 5L, 8L, 
9L, 12L, 5L, 5L, 9L, 3L, 12L, 5L, 8L, 9L, 2L, 3L, 5L, 12L, 5L, 
5L, 6L, 8L, 9L, 11L, 12L, 5L, 11L, 4L, 5L, 9L, 12L, 1L, 3L, 5L, 
8L, 9L, 12L, 2L, 3L, 5L, 9L, 12L, 3L, 5L, 9L, 10L, 12L, 3L, 5L, 
7L, 12L, 3L, 4L, 5L, 8L, 9L, 12L, 1L, 5L, 6L, 12L, 5L, 3L, 10L, 
12L, 1L, 5L, 8L, 9L, 2L, 3L, 4L, 5L, 12L, 5L, 1L, 3L, 7L, 10L, 
4L, 5L, 6L, 8L, 9L, 11L, 3L, 11L, 12L, 5L, 5L, 9L, 12L, 5L, 8L, 
9L, 12L, 1L, 2L, 3L, 5L, 9L, 12L, 3L, 5L, 9L, 10L, 12L, 3L, 5L, 
9L, 12L, 3L, 5L, 7L, 8L, 9L, 12L, 2L, 3L, 5L, 6L, 12L, 5L, 9L, 
3L, 10L, 12L, 5L, 8L, 9L, 2L, 3L, 5L, 12L, 5L, 10L, 5L, 6L, 7L, 
8L, 9L, 2L, 4L, 11L, 12L, 5L, 11L, 5L, 9L, 12L, 3L, 4L, 5L, 9L, 
12L, 1L, 3L, 5L, 9L, 12L, 3L, 5L, 9L, 10L, 12L, 3L, 5L, 12L, 
3L, 5L, 9L, 12L, 4L, 5L, 1L, 5L, 9L, 10L, 12L, 5L, 9L, 1L, 3L, 
5L, 12L, 4L, 5L, 1L, 3L, 9L, 11L, 11L, 12L, 3L), .Label = c("Ami", 
"Ced", "Cho", "Fal", "For", "Lew", "Ray", "Ric", "Sam", "Taw", 
"Tex", "Tol"), class = "factor")), row.names = c(3L, 6L, 7L, 
9L, 12L, 15L, 16L, 17L, 18L, 20L, 21L, 22L, 24L, 25L, 28L, 30L, 
33L, 34L, 35L, 36L, 39L, 41L, 45L, 46L, 50L, 51L, 52L, 54L, 60L, 
62L, 64L, 68L, 72L, 74L, 75L, 76L, 77L, 78L, 80L, 83L, 89L, 90L, 
92L, 93L, 96L, 97L, 100L, 103L, 109L, 110L, 115L, 116L, 118L, 
121L, 124L, 130L, 131L, 132L, 136L, 137L, 138L, 141L, 142L, 145L, 
147L, 153L, 154L, 155L, 157L, 158L, 161L, 163L, 169L, 170L, 171L, 
174L, 175L, 176L, 178L, 179L, 181L, 185L, 190L, 196L, 203L, 204L, 
206L, 211L, 212L, 213L, 214L, 215L, 216L, 217L, 219L, 224L, 229L, 
232L, 235L, 236L, 239L, 240L, 242L, 244L, 245L, 246L, 247L, 249L, 
250L, 253L, 255L, 259L, 260L, 263L, 267L, 268L, 269L, 270L, 272L, 
273L, 274L, 278L, 279L, 282L, 284L, 287L, 288L, 289L, 290L, 291L, 
295L, 297L, 300L, 301L, 304L, 305L, 306L, 307L, 311L, 312L, 313L, 
314L, 316L, 320L, 323L, 325L, 330L, 331L, 335L, 337L, 338L, 339L, 
340L, 341L, 343L, 346L, 351L, 353L, 354L, 356L, 357L, 358L, 361L, 
362L, 364L, 365L, 368L, 371L, 375L, 377L, 378L, 380L, 382L, 383L, 
385L, 386L, 387L, 390L, 391L, 392L, 393L, 395L, 397L, 400L, 401L, 
402L, 403L, 404L, 407L, 411L, 412L, 415L, 416L, 418L, 419L, 423L, 
426L, 427L, 430L, 431L, 434L, 435L, 436L, 437L, 438L, 439L, 442L, 
443L, 449L, 450L, 452L, 453L, 456L, 457L, 458L), class = "data.frame")

I am running a gam that looks like this:

library(mgcv)
mod1<-gam(count~te(Longitude,Latitude)+offset(log(offset))+s(name,bs="re"),family=nb, data=data)

Now I am trying to do a few things with this tensor as a heatmap and am wondering a few things. My current output looks like this with state lines (data are within the state of Texas):

library(gratia)
library(maps)
library(fields)
library(ggplot2)    
state_map <- map_data("state")
bmap=map_data("state", xlim=c(-102,-95), ylim=c(20,32))

sm <- smooth_estimates(mod1, smooth = "te(Longitude,Latitude)",partial_match=T)

draw(sm) + 
  coord_fixed(ratio = 1) +
  geom_polygon(data=bmap,aes(x=long,y=lat,group=group), inherit.aes=F, 
               colour='black', fill=NA, lwd=1) +
  scale_y_continuous(expand = c(0,0)) + 
  scale_x_continuous(expand = c(0,0)) + 
  theme(panel.background = element_rect(fill = 'white',color="black"),
        legend.key = element_blank()) 

enter image description here

What I am wondering is if there is a way to take this smooth estimate and confine the the predictions within a certain area (e.g., state lines)? For this example, all of my data are within the confines of Texas state lines but the area with the greatest impact on counts (the southeast corner of the state) is not truly represented by this heat map as the area with the highest effect is over the ocean where there were no counts at all.

I am also noticing that the original output from the draw(mod1) command looks different from the smooth_estimate output. See below:

enter image description here

If there is not a way to confine predictions within bounds (e.g., state lines), how do I get the original output that I see with the command draw(mod1) within a smooth estimate so that I can overlay this output with state lines to orient the viewer?


Solution

  • To confine the output from smooth_estimates() to be as per the draw() output, in your call to smooth_estimates(), set the dist argument to the required value. The draw() method for "gam" objects used dist = 0.1 for example.

    If you want to confine the smooth to the state boundary as well / instead, then you'll need to set the values of est in sm where Longitude and Latitude are outside the Texas state boundary to NA. I'm not sure how to do that exactly with data from the map package but there will be a way to treat the Texas boundary as a polygon and then ask if each of your point locations (Longitude and Latitude) at which the smooth was evaluated lies within that polygon. Any that don't should have the respective row in sm set to NA.