r ggplot2 sankey-diagram ggalluvial

Adding label data to Sankey plot axes with the ggalluvial package


How do I add label names from the data frame to the white boxes on each of the three axes of a ggalluvial Sankey plot? (see image)

The column "Country" has observations for multiple countries. In the example below I focus on Ireland, which has the label "IE". How do I pull the Country name from the data frame using the ggplot code below? I will also be doing this for the "Vaccine" and "TargetGroup" names, so I would be grateful if the solution could include these columns too. Thank you.

Data frame code and Sankey colours (the data is from the EU ECDC):

data <- read.csv("https://opendata.ecdc.europa.eu/covid19/vaccine_tracker/csv/data.csv", na.strings = "", fileEncoding = "UTF-8-BOM")
colnames(data)[2] <- "Country"
TestData <- data
Ireland <- subset(TestData, Country == "IE")

Combined <- rbind(Ireland)  # removed other countries for ease of reading

Col1 <- "slateblue2"
Col2 <- "chartreuse2"
Col3 <- "tomato2"
Col4 <- "orange2"
Col5 <- "plum3"
Col6 <- "grey43"
Col7 <- "deeppink"

alpha <- 0.2

This is the Sankey code. This is where I am not applying the labels to the relevant axes correctly:

ggplot(Combined,
       aes(weight = FirstDose, axis1 = Vaccine, axis2 = Country, axis3 = TargetGroup)) +
  geom_alluvium(aes(fill = Vaccine, color = Vaccine), 
                width = 0.1, alpha = alpha, knot.pos = 0.4) +
  geom_stratum(width = 0.2, color = "black") +
  geom_label(stat = "stratum", label.strata = TRUE) +
  scale_x_continuous(breaks = 1:3, labels = c("Vaccine", "Country", "TargetGroup")) +
  scale_fill_manual(values  = c(Col1, Col2, Col3, Col4, Col5, Col6, Col7)) +
  scale_color_manual(values = c(Col1, Col2, Col3, Col4, Col5, Col6, Col7)) +
  ggtitle("ECDC Vaccination data") +
  theme_minimal() +
  theme(
    axis.text.x = element_text(size = 12, face = "bold")
  )

This is the image. You can see that the axis boxes are missing labels:

[Image: Sankey plot without axis labels]


Solution

  • Figured it out. The full code chunk is below; I hope it helps someone.

    ggplot(Combined,
           aes(weight = FirstDose, axis1 = Vaccine, axis2 = Country, axis3 = TargetGroup)) +
      geom_alluvium(aes(fill = Vaccine, color = Vaccine), 
                    width = 0.1, alpha = alpha, knot.pos = 0.4) +
      geom_stratum(width = 0.2, color = "black") +
      geom_text(stat = "stratum", aes(label = after_stat(deposit)),
                color = "black") + 
      geom_label(stat = "stratum", aes(label = after_stat(stratum)), min.y = 200) + 
      scale_x_continuous(breaks = 1:3, labels = c("Vaccine", "Country", "TargetGroup")) +
      scale_fill_manual(values  = c(Col1, Col2, Col3, Col4, Col5, Col6, Col7)) +
      scale_color_manual(values = c(Col1, Col2, Col3, Col4, Col5, Col6, Col7)) +
      ggtitle("ECDC Vaccination data") +
      theme_minimal() +
      theme(legend.position = "none",
            panel.grid.major = element_blank(),
            panel.grid.minor = element_blank(),
            axis.text.y = element_blank(),
            axis.text.x = element_text(size = 12, face = "bold")
      )
    

    The fix was to add these two lines (which I don't fully understand yet) after the geom_stratum() call. The labels are now where I want them, although the plot is not pretty yet.

      geom_text(stat = "stratum", aes(label = after_stat(deposit)),
                color = "black") + 
      geom_label(stat = "stratum", aes(label = after_stat(stratum)), min.y = 200) + 
    


    The answer came from the ggalluvial reference page for stat_stratum: http://corybrunson.github.io/ggalluvial/reference/stat_stratum.html
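
For anyone else unsure what the two added lines do, here is a minimal sketch of my reading of that reference page, not an authoritative account. As I understand it, after_stat(stratum) is the category name that stat_stratum computes for each box (for example "IE"), after_stat(deposit) is an integer giving the order in which the strata are stacked, and min.y leaves out strata shorter than the given height. If that is right, the geom_text() layer with deposit only draws numbers underneath the labels and can probably be dropped. The sketch makes two assumptions that differ from the post: it uses R's built-in Titanic table instead of the ECDC download so that it runs offline, and it maps the counts with y = Freq, as the current ggalluvial documentation does, rather than weight.

```r
library(ggplot2)
library(ggalluvial)

# Assumption: the built-in Titanic counts stand in for the ECDC data,
# so the example runs without a download.
titanic <- as.data.frame(Titanic)

p <- ggplot(titanic,
            aes(y = Freq, axis1 = Class, axis2 = Sex, axis3 = Age)) +
  geom_alluvium(aes(fill = Class), width = 0.1, alpha = 0.4) +
  geom_stratum(width = 0.2, color = "black") +
  # stat = "stratum" computes one row per box; after_stat(stratum) is
  # that box's category name, which becomes the label text.
  # min.y = 200 asks the stat to skip strata with a total below 200.
  geom_label(stat = "stratum", aes(label = after_stat(stratum)),
             min.y = 200) +
  scale_x_continuous(breaks = 1:3, labels = c("Class", "Sex", "Age")) +
  theme_minimal() +
  theme(legend.position = "none")

print(p)
```

With the original data, the same geom_label() line should pick up the Vaccine, Country, and TargetGroup names, because the label comes from whichever columns are mapped to axis1, axis2, and axis3. Calling layer_data(p, 3) shows the values the stat computed for the label layer, which is a handy way to check what after_stat() has available.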