
Multiple stacked bar chart with ggplot


I have a dataset with four variables measuring respondents' view on different topics. I want to plot them into one stacked bar chart so you can compare the values between the different topics.

These are the first rows of the dataset:

lebanon <- structure(list(climate_change = c(
  "Not a very serious problem",
  "Not a very serious problem", NA, NA, "A very serious problem",
  "A somewhat serious problem"
), air_quality = c(
  "A somewhat serious problem",
  "Not a very serious problem", NA, NA, "A very serious problem",
  "A very serious problem"
), water_polution = c(
  "A somewhat serious problem",
  "Not a very serious problem", NA, NA, "A very serious problem",
  "Not at all a serious problem"
), trash = c(
  "A very serious problem",
  "Not a very serious problem", NA, NA, "A very serious problem",
  "A somewhat serious problem"
)), row.names = c(NA, -6L), class = "data.frame")

I tried the following code, based on this site:

lebanon %>%
  filter(!is.na(climate_change), !is.na(air_quality), !is.na(water_polution), !is.na(trash)) %>%
  gather(variable, value, climate_change:trash) %>%
  ggplot(aes(x = variable, y = value, fill = value)) +
  geom_bar(stat = "identity") +
  coord_flip()

Getting this graph:

[image: the resulting bar chart]

There are three problems with this graph.

1.) The bar graphs are not the same length.

2.) I don't know why there is something written at the point where the x-axis meets the y-axis. How do I remove this?

3.) I want to order the values so they make sense, so I ordered them beforehand with:

dataset$climate_change <- factor(dataset$climate_change, levels = c("Not at all a serious problem",
                                                                    "Not a very serious problem",
                                                                    "A somewhat serious problem",
                                                                    "A very serious problem"))

dataset$air_quality <- factor(dataset$air_quality, levels = c("Not at all a serious problem",
                                                                    "Not a very serious problem",
                                                                    "A somewhat serious problem",
                                                                    "A very serious problem"))

dataset$water_polution <- factor(dataset$water_polution, levels = c("Not at all a serious problem",
                                                                    "Not a very serious problem",
                                                                    "A somewhat serious problem",
                                                                    "A very serious problem"))

Yet the values are still unordered. What am I doing wrong? Or is there a more effective way to make a multiple stacked bar chart?


Solution

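  • The main issue with your code is that you mapped value, i.e. a categorical variable, to y. If you drop the y aesthetic and stat = "identity", geom_bar() falls back to its default stat = "count" and counts the rows per category itself. Furthermore, you can simply use drop_na instead of filter, and set the levels of value once after the gather instead of repeating it for each variable. (; Try this: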

    BTW: Please put your data into the post with dput(), e.g. dput(head(lebanon)). See my edit to your post. It took more time to clean the data and get it right than to answer the question. (;

    ** EDIT ** To get the bars in the desired order I make use of the forcats package. First I use add_count to count the respondents who think the issue is "A very serious problem". Then I fct_reorder variable accordingly, i.e. by -n to get descending order. To reverse the order of value I use fct_rev.

    lebanon <- structure(list(climate_change = c(
      "Not a very serious problem",
      "Not a very serious problem", NA, NA, "A very serious problem",
      "A somewhat serious problem"
    ), air_quality = c(
      "A somewhat serious problem",
      "Not a very serious problem", NA, NA, "A very serious problem",
      "A very serious problem"
    ), water_polution = c(
      "A somewhat serious problem",
      "Not a very serious problem", NA, NA, "A very serious problem",
      "Not at all a serious problem"
    ), trash = c(
      "A very serious problem",
      "Not a very serious problem", NA, NA, "A very serious problem",
      "A somewhat serious problem"
    )), row.names = c(NA, -6L), class = "data.frame")
    
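    library(tidyverse)

    lebanon %>%
      # Keep only respondents who answered all four questions
      drop_na() %>%
      # Reshape to long format: one row per respondent and topic
      gather(variable, value, climate_change:trash) %>%
      # n = number of rows per combination of variable and the "very serious" flag
      add_count(variable, value == "A very serious problem") %>%
      # Set the answer order once, on the gathered column
      mutate(value = factor(value, levels = c("Not at all a serious problem",
                                              "Not a very serious problem",
                                              "A somewhat serious problem",
                                              "A very serious problem"))) %>%
      # No y aesthetic: geom_bar() counts the rows per variable and value
      ggplot(aes(x = forcats::fct_reorder(variable, -n), fill = forcats::fct_rev(value))) +
      geom_bar() +
      coord_flip()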
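
A possible variation on the code above, as a rough sketch rather than the definitive way to do it: add_count() with two expressions counts every combination of variable and the TRUE/FALSE flag, so n is not strictly the number of "A very serious problem" answers. The version below computes that number explicitly in a helper column. The names answer_levels, lebanon_long and n_very_serious are made up for this illustration, pivot_longer() is swapped in for gather() (which tidyr marks as superseded), and position = "fill" is an optional choice that scales every bar to the same length even when the topics have different numbers of answers.

```r
library(dplyr)
library(tidyr)
library(forcats)
library(ggplot2)

# Same example data as in the question
lebanon <- data.frame(
  climate_change = c("Not a very serious problem", "Not a very serious problem",
                     NA, NA, "A very serious problem", "A somewhat serious problem"),
  air_quality = c("A somewhat serious problem", "Not a very serious problem",
                  NA, NA, "A very serious problem", "A very serious problem"),
  water_polution = c("A somewhat serious problem", "Not a very serious problem",
                     NA, NA, "A very serious problem", "Not at all a serious problem"),
  trash = c("A very serious problem", "Not a very serious problem",
            NA, NA, "A very serious problem", "A somewhat serious problem")
)

# Answer categories from least to most serious (name chosen for this sketch)
answer_levels <- c(
  "Not at all a serious problem",
  "Not a very serious problem",
  "A somewhat serious problem",
  "A very serious problem"
)

lebanon_long <- lebanon %>%
  # Drop respondents with a missing answer on any topic
  drop_na() %>%
  # Long format: one row per respondent and topic
  pivot_longer(everything(), names_to = "variable", values_to = "value") %>%
  # Set the answer order once, on the single long column
  mutate(value = factor(value, levels = answer_levels)) %>%
  # Hypothetical helper column: "very serious" answers per topic
  group_by(variable) %>%
  mutate(n_very_serious = sum(value == "A very serious problem")) %>%
  ungroup() %>%
  # Order the topics by that count (ascending; after coord_flip()
  # the topic with the highest count ends up at the top)
  mutate(variable = fct_reorder(variable, n_very_serious))

p <- ggplot(lebanon_long, aes(x = variable, fill = fct_rev(value))) +
  # position = "fill" rescales each bar to proportions, so all bars
  # have the same length
  geom_bar(position = "fill") +
  coord_flip() +
  labs(x = NULL, y = "Share of respondents", fill = NULL)

p
```

If you prefer raw counts over shares, drop position = "fill"; after drop_na() every topic has the same number of rows anyway, so the bars should still line up.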