Tags: r, plot, ggplot2, heatmap, levelplot

Contour plot or heatmap from three continuous variables


I have a model which tells me there is an interaction between two variables, a and b, that significantly influences my response variable, c. All three are continuous numeric variables. For detail: c is the rate of change in my response variable, b is the rate of change in my predictor, and a is mean annual rainfall. The unit of analysis is a pixel in a raster. So my model is telling me that mean annual rainfall modifies how my predictor affects my response.

To visualise this interaction I would like to use a contour plot, heat map or level plot with a and b on the x and y axes and c mapped to colour, to show how my response variable changes within the space described by a and b. I can do this with a scatter plot, but it's not very pretty or easy to interpret:

qplot(b, a, colour = c) +
  scale_colour_gradient(low = "green", high = "red")

[image: scatter plot of a against b, coloured by c]

When I try to draw a contour plot, heat map or level plot, though, all I get are errors, blank plots or ugly plots.

geom_contour fails with a warning:

ggplot(data = Mod, aes(x = Rain, y = Bomas, z = Fire)) +
  geom_contour()

Warning message:
Not possible to generate contour data

geom_raster initially gives me "Error: cannot allocate vector of size 81567.2 Gb", but when I round my data it produces:

ggplot(data = df, aes(x = a, y = b, z = c)) +
  geom_raster(aes(fill = c))

[image: geom_raster output]

Adding interpolate = TRUE to the geom_raster code just makes the lines a little blurry.

geom_tile produces a blank graph but with a scale bar for c:

ggplot(data = df, aes(x = a, y = b, z = c)) +
  geom_tile(aes(color = c))

[image: geom_tile output]

I've also tried using stat_density2d and setting the fill and/or the colour to c, but just got an error, and I've tried using levelplot in the lattice package as well but that produces this:

levelplot(c ~ a * b, data = df,
          aspect = "asp", contour = TRUE,
          xlab = "a",
          ylab = "b")

[image: levelplot output]

I suspect the problems I'm encountering arise because these functions are not set up to deal with continuous x and y variables; all the examples seem to use factors. I would have thought I could compensate for that by changing bin widths, but that doesn't seem to work either. Is there a function that lets you make a heat map from three continuous variables? Or do I need to treat a and b as factors and manually build a data frame with bins appropriate for my data?

If you want to experiment yourself, the following data gives problems similar to the ones I'm having:

# Three independent standard-normal columns, 1068 rows each
df <- data.frame(a = rnorm(1068),
                 b = rnorm(1068),
                 c = rnorm(1068))

Solution

  • You can get automatic bins and, for example, calculate the mean of c in each bin by using stat_summary_2d:

    library(ggplot2)
    # Tiles show the mean of c per (a, b) bin; white circles are the raw points
    ggplot(df, aes(a, b, z = c)) +
      stat_summary_2d() +
      geom_point(shape = 1, col = 'white') +
      viridis::scale_fill_viridis()
    

    [image: stat_summary_2d heat map with the raw points overlaid]
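
    The binning can also be controlled explicitly. The following is a minimal sketch, assuming the random df from the question; the choice of 10 bins and of median as the summary function is purely illustrative, and scale_fill_viridis_c() is used here only to avoid the extra viridis dependency. layer_data() is used to look at the per-bin values that ggplot2 computed:

    ```r
    library(ggplot2)

    set.seed(1)  # assumption: reproducible stand-in for the question's random data
    df <- data.frame(a = rnorm(1068), b = rnorm(1068), c = rnorm(1068))

    # bins = 10 asks for at most a 10 x 10 grid over a and b (illustrative value);
    # fun = median replaces the default per-bin mean of c.
    p <- ggplot(df, aes(a, b, z = c)) +
      stat_summary_2d(bins = 10, fun = median) +
      scale_fill_viridis_c()

    # layer_data() returns the data ggplot2 computed for layer 1:
    # one row per non-empty bin, with the summary stored in `value`.
    binned <- layer_data(p, 1)
    head(binned[, c("x", "y", "value")])
    ```

    Fewer, larger bins give a smoother but coarser picture; with sparse data it can help to lower bins until most tiles contain several points.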

    Another good option is to slice your data by the third variable and plot small multiples. This doesn't show up very well with random data, though:

    library(ggplot2)
    ggplot(df, aes(a, b)) +
      geom_point() +
      facet_wrap(~cut_number(c, 4))
    

    [image: scatter plots of a against b, faceted by four bins of c]
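
To see what the faceting step does, it can help to compute the slices before plotting. This is a rough sketch, assuming the random df from the question; the column name c_bin is hypothetical and only introduced for illustration. cut_number() splits c into the requested number of intervals with roughly equal numbers of observations in each:

```r
library(ggplot2)

set.seed(1)  # assumption: reproducible stand-in for the question's random data
df <- data.frame(a = rnorm(1068), b = rnorm(1068), c = rnorm(1068))

# Split c into 4 intervals holding roughly equal numbers of rows.
# c_bin is a hypothetical helper column, not part of the original answer.
df$c_bin <- cut_number(df$c, 4)

# Inspect the interval boundaries and how many rows fall into each one.
bin_counts <- table(df$c_bin)
print(bin_counts)

# Same small multiples as above, with the colour still showing c within each slice.
p <- ggplot(df, aes(a, b, colour = c)) +
  geom_point() +
  facet_wrap(~c_bin)
```

If equal-width slices are preferred over equal-count slices, cut_interval() from ggplot2 can be swapped in for cut_number().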