I'm trying to create a dotplot in R, similar to the following plot, where each group is distinctly separated from the rest: http://www.sthda.com/english/wiki/ggplot2-dot-plot-quick-start-guide-r-software-and-data-visualization
My data looks as follows: a value to plot, and a group column that should split the data into distinct groups (1-5), similar to the 'dose' column in the ToothGrowth dataset used at the link above:
This is the plotting code I'm currently using:
library(ggplot2)
p <- ggplot(new_df, aes(x=group, y=ploidy)) +
  geom_dotplot(binaxis='y', stackdir='centerwhole', binpositions='bygroup',
               binwidth=0.5, position="dodge", dotsize=0.2)
ggplot(new_df, aes(x=group, y=ploidy)) +
  geom_dotplot(binaxis='y', stackdir='centerwhole',
               stackratio=0, dotsize=0.2, stackgroups=TRUE)
p + stat_summary(fun=median, geom="point", shape=18,
                 size=3, color="red")
and it returns the following plot:
I suspect the issue is that most of the values fall in the 2-3 range, so the dots overflow into the neighboring groups.
I tried to reproduce the problem with simple datasets such as ToothGrowth, but it does not appear with those smaller datasets, so here is a link to my full dataset instead: http://sendanywhe.re/Y5O133EM
Any help would be appreciated.
I think you are overflowing the space allocated to each group by placing every observation at a specified location (sometimes called 'stacking'). Instead, you should 'jitter' the positions of the individual observations inside the region allocated to their group. Jittering means adding a small amount of randomness to the position of each point, which avoids most overplotting.
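As a rough sketch of what jittering does, the base R function jitter() can shift a set of identical x positions by a bounded random amount. The amount of 0.2 and the five made-up points are arbitrary choices for illustration, not values taken from your data.

```r
# Five observations that all belong to group 1, so they share one x position.
set.seed(1)
x_pos <- rep(1, 5)

# jitter() with a positive `amount` adds uniform noise in [-amount, amount].
x_jit <- jitter(x_pos, amount = 0.2)

# Every jittered position stays inside the strip from 0.8 to 1.2,
# so the points cannot spill into a neighboring group.
print(range(x_jit))
```

Because the noise is bounded, the width of each group's strip is under your control no matter how many observations share the same value.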
I will illustrate this using base R graphics for the following fictitious data. This focuses attention on what is wrong, more than on the specific programming solution in ggplot, which I will let you work out.
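set.seed(2022)
# Four fictitious groups of different sizes, rounded so that many values tie
a = round(rnorm(30, 50, 5))
b = round(rnorm(70, 55, 4))
c = round(rnorm(55, 40, 6))
d = round(rnorm(80, 45, 5))
x = c(a,b,c,d)
g = rep(1:4, c(30,70,55,80))   # group label for each observation
# Jitter the points horizontally within each group's strip
stripchart(x ~ g, method="jitter", vertical=TRUE, pch=20)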
Sorry, I am not allowed to post images on this site. I hope you get the idea.
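For what it's worth, one possible ggplot2 version of the same idea is sketched below. It is only a sketch: it uses geom_jitter() in place of geom_dotplot(), builds its own fictitious data frame (the names df, x and g are made up here), and the jitter width of 0.2 is an arbitrary choice. The fun argument of stat_summary() assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(2022)
# Fictitious data: four groups of different sizes, stored in a data frame.
df <- data.frame(
  x = round(c(rnorm(30, 50, 5), rnorm(70, 55, 4),
              rnorm(55, 40, 6), rnorm(80, 45, 5))),
  g = factor(rep(1:4, c(30, 70, 55, 80)))  # factor, so each group gets its own strip
)

# Jitter horizontally only (height = 0), so the plotted y values stay exact.
p <- ggplot(df, aes(x = g, y = x)) +
  geom_jitter(width = 0.2, height = 0, size = 1) +
  stat_summary(fun = median, geom = "point", shape = 18, size = 3, color = "red")
print(p)
```

With your data this would presumably become aes(x = factor(group), y = ploidy). Wrapping group in factor() is an assumption on my part, in case the column is stored as a number, in which case ggplot2 would treat it as a continuous axis.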