I'm trying to summarize Likert scale ratings data with colored bubbles in a plot. I currently have a violin plot overlaid on a jittered, faceted scatterplot, which comes close to what I am trying to communicate but doesn't quite get there.
Ideally, I would just have bubbles for each of the points on the Likert scale, sized by the number (or proportion) of items that had that score, and shaded by the mean value of the spKnownShown variable. Making a contingency table for the Likert-facet-x-axis combinations is trivial, but how do I link each cell to the mean of spKnownShown? Any recommendations for taking the next leap into an actual plot from the contingency table would be appreciated.
Apologies that I can't share the data, as it is under a confidentiality agreement.
Consider using functions from the dplyr package. I first make a fake dataset, where x, y, v, and f correspond to the x-axis variable, the Likert score, the value whose mean you want, and the facet, respectively.
library(ggplot2)
library(dplyr)
n <- 1000
set.seed(1)
d <- data.frame(x = sample(0:1, n, replace = TRUE),  # x-axis variable
                y = pmin(rpois(n, 2), 6),            # Likert score, capped at 6
                v = rnorm(n),                        # value to average
                f = sample(0:2, n, replace = TRUE))  # facet
Creating the values you want is a combination of using group_by and summarise from dplyr:
plt <- d %>%
  group_by(f, x, y) %>%
  summarise(n = n(), v = mean(v))
Finally, plot:
ggplot(plt, aes(x = factor(x), y = factor(y), size = n, colour = v)) +
  geom_point() +
  facet_wrap("f")
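The question also mentions sizing bubbles by proportion rather than count. Here is a minimal sketch of one way to do that, assuming the proportion should be taken within each facet and x-axis combination (that choice, and the names plt_prop and prop, are my own and not from the question). It rebuilds the same fake data so it can be run on its own, and uses scale_size_area() so that bubble area rather than radius scales with the proportion.

```r
library(ggplot2)
library(dplyr)

# Same fake data as in the answer
set.seed(1)
n <- 1000
d <- data.frame(x = sample(0:1, n, replace = TRUE),
                y = pmin(rpois(n, 2), 6),
                v = rnorm(n),
                f = sample(0:2, n, replace = TRUE))

# Count and mean per facet/x/Likert cell, then convert the counts to
# proportions within each facet/x combination (an assumed denominator)
plt_prop <- d %>%
  group_by(f, x, y) %>%
  summarise(n = n(), v = mean(v)) %>%
  group_by(f, x) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

# Bubble area is proportional to prop; colour shows the mean of v
ggplot(plt_prop, aes(x = factor(x), y = factor(y), size = prop, colour = v)) +
  geom_point() +
  scale_size_area() +
  facet_wrap("f")
```

If you would rather have the proportions sum to 1 within each facet only, change the second group_by to group_by(f).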