
How to connect bar plots, the way geom_density_ridges does for histograms


I am having trouble creating a plot in R. If I have data like

(image: example data, not reproduced here)

I want to create:

(image: desired plot, not reproduced here)

with the x-axis showing Sepal.Length, Sepal.Width, Petal.Length, and Petal.Width, the y-axis showing the species, and the height showing the values. I also want each bar plot filled with a different color according to its species on the y-axis.

Thank you!

So far, I have tried:

library(ggplot2)
library(ggridges)
library(reshape2)

# Mean of each measurement per species, reshaped to long format
iris_mean <- aggregate(iris[, 1:4], by = list(Species = iris$Species), FUN = mean)
df_mean <- melt(iris_mean, id.vars = "Species", variable.name = "Samples",
                value.name = "Values")

# Attempt 1: faceted bar plot of the means
ggplot(df_mean, aes(Samples, Values)) +
  geom_bar(aes(fill = Species), stat = "identity") +
  facet_grid(Species ~ ., scales = "free", space = "free") +
  theme(panel.spacing = unit(0.1, "lines"))

# Attempt 2: ridgeline plot using the means as heights
ggplot(df_mean, aes(x = Samples, y = Species, height = Values)) +
  geom_density_ridges2(aes(fill = Species), stat = "identity",
                       scale = 1.5,
                       alpha = 0.1,
                       lty = 1.1)

Solution

  • Your faceted plot is on the right track. As I said in my comment, you're trying to display a distribution of values, not the means of those values. You could set breaks manually and calculate counts to show in a geom_bar, but that gets complicated quickly, especially since the different types of measures are on different scales. I'd recommend just sticking with a simple histogram. I used gather rather than melt to make the data long; that's just preference.

    Beyond what you've got, it's a matter of 1. working with distributions, and 2. being clever with the theme. If you move the facet labels, rotate the left-side strips, take out the strip background, and remove the vertical spacing between panels, you've essentially got a ridge plot. I'm not very familiar with ggridges, but I'd guess it does something similar. From here, you can adjust as you see fit.

    library(tidyverse)
    
    iris_long <- as_tibble(iris) %>%
      gather(key = measure, value = value, -Species)
    
    ggplot(iris_long, aes(x = value, fill = Species)) +
      # geom_density_ridges() +
      geom_histogram(show.legend = FALSE) +
      scale_y_continuous(breaks = NULL) +
      labs(x = "Measure", y = "Species") +
      facet_grid(Species ~ measure, scales = "free", switch = "both") +
      theme(strip.background = element_blank(),
            strip.text.y = element_text(angle = 180),
            strip.placement = "outside",
            panel.spacing.y = unit(0, "cm"))
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    

    Created on 2018-07-19 by the reprex package (v0.2.0).
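
For comparison, here is a minimal sketch of how the same idea might look with ggridges itself, which the answer above leaves commented out. It relies on the "binline" stat that ggridges provides for histogram-style ridgelines. The bins, scale, and draw_baseline values are illustrative choices rather than anything from the original answer, and pivot_longer (from tidyr 1.0 or later) is swapped in for gather as an assumption about the installed version.

```r
library(ggplot2)
library(ggridges)
library(tidyr)

# Long format: one row per (Species, measure, value)
iris_long <- pivot_longer(iris, cols = -Species,
                          names_to = "measure", values_to = "value")

# One binned ridgeline per species, one panel per measure.
# bins = 20 and scale = 0.95 are illustrative values, not tuned.
p <- ggplot(iris_long, aes(x = value, y = Species, fill = Species)) +
  geom_density_ridges(stat = "binline", bins = 20, scale = 0.95,
                      draw_baseline = FALSE, show.legend = FALSE) +
  facet_wrap(~ measure, nrow = 1, scales = "free_x") +
  labs(x = "Measure", y = "Species")

p
```

Because each measure gets its own panel with a free x-axis, the different measurement scales do not squash one another. A scale just below 1 keeps neighboring ridgelines from overlapping; raising it lets them overlap as in a typical ridge plot.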