Tags: r, ggplot2, geometry, projection

ggplot2: Projecting points or distribution on a non-orthogonal (eg, -45 degree) axis


The figure below is a conceptual diagram used by Michael Clark (https://m-clark.github.io/docs/lord/index.html) to explain Lord's Paradox and related phenomena in regression.

[image: Michael Clark's conceptual diagram]

My question is framed in this context and uses ggplot2, but it is really a broader question about geometry and graphing.

I would like to reproduce figures like this, but using actual data. I need to know:

  • how to draw a new axis at the origin, with a -45 degree angle, corresponding to values of y-x
  • how to draw little normal distributions or density diagrams, or other representations of the values y-x projected onto this axis.

My minimal example uses ggplot2:

library(ggplot2)
set.seed(1234)
N <- 200
group   <- rep(c(0, 1), each = N/2)
initial <- .75*group + rnorm(N, sd=.25)
final   <- .4*initial + .5*group + rnorm(N, sd=.1)
change  <- final - initial
df <- data.frame(id = factor(1:N), 
                group = factor(group, 
                               labels = c('Female', 'Male')), 
                initial, 
                final, 
                change)
#head(df)

#' plot, with regression lines and data ellipses
ggplot(df, aes(x = initial, y = final, color = group)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  stat_ellipse(size = 1.2) +
  geom_abline(slope = 1, color = "black", size = 1.2) +
  coord_fixed(xlim = c(-.6, 1.2), ylim = c(-.6, 1.2)) +
  theme_bw() +
  theme(legend.position = c(.15, .85))

This gives the following graph:

[image: scatterplot of final vs. initial by group, with regression lines, data ellipses and the line y = x]

Geometrically, the axes I want are the plot's axes rotated by -45 degrees; in the original coordinates of the plot they correspond to (y-x) and (x+y). But how can I draw these with ggplot2 or other software?

An acceptable solution can be loose about how the distribution of (y-x) is represented, but it should solve the problem of displaying it along a (y-x) axis.


Solution

  • Fun question! There might be a package that does this automatically, but I haven't come across one yet. Here's a manual approach using two hacks:

    1. the clip = "off" parameter of the coord_* functions, to allow us to add annotations outside the plot area.
    2. building a density plot, extracting its coordinates, and then rotating and translating those.

    First, we can make a density plot of the change from initial to final, which shows a left-skewed distribution:

    library(dplyr)  # provides %>%, mutate() and tibble()

    (my_hist <- df %>%
        mutate(gain = final - initial) %>% # "gain" would be a better name than "change"
        ggplot(aes(gain)) +
        geom_density())
    

    [image: density plot of gain]

    Now we can extract the guts of that plot, and transform the coordinates to where we want them to appear in the combined plot:

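    a <- ggplot_build(my_hist)
    rot <- pi * 3/4  # 135 degrees, in radians
    diag_hist <- tibble(
      x = a[["data"]][[1]][["x"]],  # gain values at which the density was evaluated
      y = a[["data"]][[1]][["y"]]   # density heights
    ) %>%
      # squish the density height so the curve fits in the margin
      mutate(y = y*0.2) %>%
      # rotate 135 deg CCW
      mutate(xy = x*cos(rot) - y*sin(rot),
             dens = x*sin(rot) + y*cos(rot)) %>%
      # slide to the lower-left corner of the panel
      mutate(xy = xy - 0.7,  #  magic number based on plot range below
             dens = dens - 0.7)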
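
    To make the rotate-and-slide step concrete, here is a minimal sketch of the same transform as a standalone function. The name rotate_shift and its arguments are made up for this illustration and are not part of ggplot2 or any package. It also shows a scale detail: a point with final - initial = g lies at perpendicular distance g/sqrt(2) from the line y = x, so if the curve should line up with the orthogonal projections of the plotted points, the gain values would have to be divided by sqrt(2) before rotating (the code above plots them at full scale).

    ```r
    # Hypothetical helper: rotate (x, y) counter-clockwise by `angle` radians,
    # then shift the result by (dx, dy).
    rotate_shift <- function(x, y, angle, dx = 0, dy = 0) {
      data.frame(
        x = x * cos(angle) - y * sin(angle) + dx,
        y = x * sin(angle) + y * cos(angle) + dy
      )
    }

    rot <- 3 * pi / 4

    # One unit along the gain axis ends up along (-1, 1) / sqrt(2),
    # i.e. the direction in which y - x increases.
    p <- rotate_shift(1, 0, rot)

    # One unit of density height ends up along (-1, -1) / sqrt(2),
    # i.e. pointing away from the panel.
    q <- rotate_shift(0, 1, rot)

    # Orthogonal projection of a data point onto the rotated axis through
    # the origin: the signed distance along the axis is (y0 - x0) / sqrt(2).
    x0 <- 0.2
    y0 <- 0.8
    g <- y0 - x0
    foot <- rotate_shift(g / sqrt(2), 0, rot)  # the point (-g/2, g/2)
    ```

    With this helper, the transform above could be written as rotate_shift(x, y * 0.2, rot, -0.7, -0.7).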
    

    And here's a combination with the original plot:

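    ggplot(df, aes(x = initial, y = final, color = group)) +
      geom_point() + 
      geom_smooth(method = "lm", formula = y ~ x) +
      stat_ellipse(size = 1.2) +
      geom_abline(slope = 1, color = "black", size = 1.2) +
      # clip = "off" lets annotations be drawn outside the panel;
      # expand takes a logical, so use FALSE to turn off axis padding
      coord_fixed(clip = "off", 
                  xlim = c(-0.7, 1.6),
                  ylim = c(-0.7, 1.6), 
                  expand = FALSE) +
      # the new (y - x) axis, passing through the panel corner at (-0.7, -0.7)
      annotate("segment", x = -1.4, xend = 0, y = 0, yend = -1.4) +
      # the rotated density curve
      annotate("path", x = diag_hist$xy, y = diag_hist$dens) +
      theme_bw() +
      # extra bottom and left margin to make room for the annotations
      theme(legend.position = c(.15, .85), 
            plot.margin = unit(c(.1, .1, 2, 2), "cm")) 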
    

    [image: the combined plot, with the rotated density curve drawn along a diagonal axis at the lower-left corner]
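
    The same idea might be extended to one curve per group. The following is only a rough sketch under the same assumptions as above (135 degree rotation, 0.2 squish, -0.7 slide). It computes the densities with stats::density() rather than ggplot_build(), and re-simulates the question's data so that it runs on its own. The name diag_dens is made up for this example.

    ```r
    # Re-create the question's data so this block is self-contained
    set.seed(1234)
    N <- 200
    group   <- rep(c(0, 1), each = N / 2)
    initial <- .75 * group + rnorm(N, sd = .25)
    final   <- .4 * initial + .5 * group + rnorm(N, sd = .1)
    df <- data.frame(group = factor(group, labels = c("Female", "Male")),
                     initial, final)

    rot    <- 3 * pi / 4  # same 135 degree CCW rotation as above
    squish <- 0.2         # same height scaling as above
    shift  <- -0.7        # same slide as above (lower-left panel corner)

    # One density curve per group, each rotated and shifted the same way
    diag_dens <- do.call(rbind, lapply(split(df, df$group), function(d) {
      dens <- density(d$final - d$initial)  # default: 512 evaluation points
      gx <- dens$x
      gy <- dens$y * squish
      data.frame(group = d$group[1],
                 x = gx * cos(rot) - gy * sin(rot) + shift,
                 y = gx * sin(rot) + gy * cos(rot) + shift)
    }))
    ```

    The result could then replace the annotate("path", ...) layer in the combined plot, for example with geom_path(data = diag_dens, aes(x = x, y = y, color = group), inherit.aes = FALSE).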