My question relates to this article by Davis and Chen (2006), which shows a way to visualise Kendall's tau, a non-parametric measure of correlation between two variables.
Given a number of datapoints in a scatterplot, each point is connected to all the other points by a line segment. A line segment can be of different colours following these criteria:
Here is an example from the original article:
My problem is that I can generate a scatterplot, but not the line segments that connect all possible pairs of points, changing colour depending on the criteria above.
Here is an example dataset:
dataset <- dplyr::tibble(alpha = c(1, 5, 7, 8, 9, 10, 11, 12),
                         beta = c(7, 7, 5, 4, 3, 14, 15, 18))
I can generate this:
ggplot2::ggplot(dataset, ggplot2::aes(x = alpha, y = beta)) + ggplot2::geom_point()
but not this:
NOTE: The solution has to generalise to a dataset with a large number of datapoints (~1000).
There are many ways, but you need to build your own data.frame of segments, e.g.:
library(tidyverse)

# Pair every point with every other point: one row per (point, other point)
pd <- dataset %>%
  mutate(d = map(row_number(), function(x) slice(., -x) %>% rename(x = alpha, y = beta))) %>%
  unnest(d) %>%
  mutate(
    slope = (y - beta) / (x - alpha),
    # Classify each segment by the sign of its slope
    cat = case_when(
      is.infinite(slope) | slope > 0 ~ 'a',  # rising or vertical
      slope < 0 ~ 'b',                       # falling
      slope == 0 ~ 'c'                       # horizontal
    )
  )

# Draw the coloured segments first, then the points on top
ggplot() +
  geom_segment(aes(alpha, xend = x, beta, yend = y, color = cat), pd) +
  geom_point(aes(alpha, beta), dataset) +
  scale_color_manual(values = c(a = 'black', b = 'red', c = 'blue'))
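The code above creates every segment twice, once from each endpoint, so n points give n(n - 1) rows. For the ~1000 points mentioned in the question, it may be worth building each pair only once. Below is a minimal sketch of that variant, assuming the same `dataset` and the same slope categories. The names `idx` and `segments` and the `alpha = 0.3` transparency are my own choices, not anything from the article.

```r
library(dplyr)
library(ggplot2)

dataset <- tibble(alpha = c(1, 5, 7, 8, 9, 10, 11, 12),
                  beta  = c(7, 7, 5, 4, 3, 14, 15, 18))

# All unordered index pairs (i < j): a 2 x choose(n, 2) matrix,
# one column per segment
idx <- combn(nrow(dataset), 2)

segments <- tibble(
  x    = dataset$alpha[idx[1, ]],
  y    = dataset$beta[idx[1, ]],
  xend = dataset$alpha[idx[2, ]],
  yend = dataset$beta[idx[2, ]]
) %>%
  mutate(
    slope = (yend - y) / (xend - x),
    # Same categories as above: rising or vertical, falling, horizontal
    cat = case_when(
      is.infinite(slope) | slope > 0 ~ "a",
      slope < 0 ~ "b",
      slope == 0 ~ "c"
    )
  )

# Semi-transparent segments (an assumption, to reduce overplotting)
p <- ggplot() +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend, colour = cat),
               data = segments, alpha = 0.3) +
  geom_point(aes(x = alpha, y = beta), data = dataset) +
  scale_colour_manual(values = c(a = "black", b = "red", c = "blue"))
```

Printing `p` draws the plot. With 1000 points this gives `choose(1000, 2)`, i.e. 499,500 segments, instead of 999,000, which halves the amount of data ggplot has to draw.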