Tags: r, ggplot2, legend, percentage, density-plot

ggplot2 geom_density: correct way to display density as percentage, changing the legend key shape


I am trying to do what I thought would be an easy task, but I am stuck.

I just want to display a density plot as percentages, but ggplot2::scale_y_continuous() does not seem to work the way I expect it to...

Besides, I want to tweak the legend and change the number of columns and the key shape with ggplot2::guide_legend()... the latter does not work (but it does work with ggplot2::geom_point()).

What am I doing wrong?

Here is my MWE with iris data (sorry there might be a bit of extra info, but I kept it as close as possible to my real world code):

data(iris)
iris$ID <- paste("sample", rownames(iris))
selected_marker <- "Sepal.Length"
group_for_color <- "Species"
X <- iris[,selected_marker]
plot_data <- data.frame(X, Group=iris[, group_for_color],
                        ID=iris$ID, stringsAsFactors=FALSE)
names(plot_data)[1] <- selected_marker
plot_data$Group <- factor(plot_data$Group, levels=gtools::mixedsort(unique(plot_data$Group)))
plot_palette <- c("#E41A1C", "#377EB8", "#4DAF4A")
legend_cols <- 1
stroke <- 1
size <- 20
bottom <- 0
top <- 100
breaks <- seq(0, 100, 10)
P <- ggplot2::ggplot(plot_data, ggplot2::aes(x=!!ggplot2::sym(selected_marker), color=Group, fill=Group)) +
  ggplot2::geom_density(alpha=0.25, linewidth=stroke) +
  ggplot2::ggtitle(selected_marker) +
  ggplot2::scale_fill_manual(group_for_color, values=plot_palette, drop=FALSE, guide="none") +
  ggplot2::scale_color_manual(group_for_color, values=plot_palette, drop=FALSE,
                              guide=ggplot2::guide_legend(ncol=legend_cols,
                                                          override.aes=list(shape=19, size=size))) +
  ggplot2::scale_y_continuous("percent", labels=scales::percent, limits=c(bottom, top), breaks=breaks) +
  ggplot2::theme_light() +
  ggplot2::theme(plot.title = ggplot2::element_text(face="bold",size=size),
                 axis.text=ggplot2::element_text(size=size),
                 axis.title=ggplot2::element_text(size=size,face="bold"),
                 legend.text=ggplot2::element_text(size=size),
                 legend.title=ggplot2::element_text(size=size,face="bold"),
                 legend.position="bottom",
                 legend.key.size=grid::unit(0.5,"inch"),
                 plot.margin = grid::unit(c(1,1,1,1), "lines"))
grDevices::pdf(file="test.pdf", height=10, width=10)
print(P)
grDevices::dev.off()

This gets me this plot:

[Image: test]

Notice that the y axis does not run from 0 to 100%, so the distributions are all squashed, and the legend key shape has not changed to 19 (filled circle).

Any idea how to solve these problems? I thought this would be fairly straightforward. Thanks!


Solution

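  • Make these replacements (scales::percent treats 1 as 100%, so the limits and breaks must be given as proportions):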

    top <- 1  # since 1 = 100%
    breaks <- seq(0, 1, 0.1)
    
    ...
    ggplot2::geom_density(alpha=0.25, linewidth=stroke, key_glyph = ggplot2::draw_key_point) +
    # Drop the limits from scale_y_continuous: scale limits discard data outside them,
    # which would cut off any part of a curve above 100%.
    # To control only the visible range, add `ggplot2::coord_cartesian(ylim = c(0, 1))` instead.
    ggplot2::scale_y_continuous("percent", labels=scales::percent, breaks=breaks) +
    ...
    

    (Keep in mind that a density curve is not capped at 100%. If the data around a peak span less than 1 unit on the x axis, the peak can rise above 100%. It is the area under the curve, not its height, that equals 1.)
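
The point about peak height versus area can be checked with base R alone. This is a minimal sketch on made-up data (the sample size, mean and standard deviation are arbitrary assumptions, not values from the question): the values are concentrated within a fraction of one x unit, so the estimated density peaks well above 1 while the area under the curve stays close to 1.

```r
# Hypothetical data: 1000 values concentrated within a fraction of one x unit
set.seed(42)
x <- stats::rnorm(1000, mean = 5, sd = 0.1)

# Kernel density estimate with the default settings of stats::density()
d <- stats::density(x)
peak <- max(d$y)

# Trapezoidal approximation of the area under the estimated curve
area <- sum(diff(d$x) * (utils::head(d$y, -1) + utils::tail(d$y, -1)) / 2)

cat("peak height:", round(peak, 2), "\n")
cat("area under curve:", round(area, 3), "\n")
```

With percent labels, a peak like this would be drawn far above the 100% mark, which is why hard limits of 0 to 1 on the scale can remove part of a curve.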

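
Putting the replacements together, a trimmed-down version of the plot might look like the sketch below. It assumes ggplot2 (3.4.0 or later, for `linewidth`) and scales are installed, uses iris directly instead of the `plot_data` wrapper from the question, and picks an illustrative legend point size of 5. The commented-out `coord_cartesian()` call and its upper bound of 1.5 are likewise only an example of zooming without dropping data.

```r
# Assumes the ggplot2 (>= 3.4.0) and scales packages are installed
plot_palette <- c("#E41A1C", "#377EB8", "#4DAF4A")

p <- ggplot2::ggplot(iris, ggplot2::aes(x = Sepal.Length, color = Species, fill = Species)) +
  # key_glyph swaps the default rectangular legend key for a point
  ggplot2::geom_density(alpha = 0.25, linewidth = 1, key_glyph = ggplot2::draw_key_point) +
  ggplot2::scale_fill_manual("Species", values = plot_palette, guide = "none") +
  ggplot2::scale_color_manual(
    "Species", values = plot_palette,
    # shape and size now take effect because the key is drawn as a point
    guide = ggplot2::guide_legend(ncol = 1, override.aes = list(shape = 19, size = 5))
  ) +
  # Breaks are proportions: scales::percent labels 0.1 as 10% and 1 as 100%
  ggplot2::scale_y_continuous("percent", labels = scales::percent, breaks = seq(0, 1, 0.1)) +
  ggplot2::theme_light() +
  ggplot2::theme(legend.position = "bottom")

# Optional zoom that keeps all data; the 1.5 upper bound is an arbitrary choice
# p <- p + ggplot2::coord_cartesian(ylim = c(0, 1.5))

print(p)
```

Leaving `limits` off the y scale means no part of a curve is discarded, and the coordinate system is the place to restrict the visible range if that is needed.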