
Ordering of items within a stacked geom_bar


I want, for reasons which seem good to me, to plot a stacked bar chart with the bars in a specific, data-dependent order. For reasons which are obscure to me, it does not work. Specifically, I can readily arrange the rows of my dataframe in the right order and make the column of names identifying the bars an ordered factor, which gets the bars in the order I want. However, the graph does not stack the columns of the dataframe (the response categories) in the order I want.

An example

tab <- structure(list(Item = c("Personal", "Peripheral", "Communication", "Multimedia", "Office", "Social Media"), `Not at all` = c(3.205128, 18.709677, 5.844156, 31.578947, 20.666667, 25.827815), Somewhat = c(30.76923, 23.87097, 24.67532, 18.42105, 30, 16.55629), `Don't know` = c(0.6410256, 2.5806452, 1.9480519, 11.1842105, 2.6666667, 5.9602649), Confident = c(32.69231, 29.67742, 33.11688, 17.10526, 23.33333, 27.15232), `Very confident` = c(32.69231, 25.16129, 34.41558, 21.71053, 23.33333, 24.50331)), .Names = c("Item", "Not at all", "Somewhat", "Don't know", "Confident", "Very confident"), row.names = c(NA, -6L), class = "data.frame")

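library(dplyr)        # %>% and arrange()
library(reshape2)     # melt()
library(ggplot2)
library(RColorBrewer) # brewer.pal()

Title <- 'Plot title'
ResponseLevels <- c("Not at all", "Somewhat", "Don't know", "Confident", "Very confident") # Labels for bars
items <- nrow(tab)                 # Number of bars (one per item)
category <- length(ResponseLevels) # Number of response categories

pal.1 <- brewer.pal(category, 'BrBG') # Colours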

tab <- tab %>% arrange(.[,2]) # Sort by the first response column ('Not at all')
tab$Item <- factor(tab$Item, levels = tab$Item[order(tab[,2])], ordered = TRUE) # Reorder factor levels

tab.m <- melt(tab, id = 'Item')
tab.m$col <- rep(pal.1, each = items) # Set colours

g <- ggplot(data = tab.m, aes(x = Item, y = value, fill = col)) + 
    geom_bar(position = "stack", stat = "identity", aes(group = variable)) +
    coord_flip() +
    scale_fill_identity("Percent", labels = ResponseLevels, 
                        breaks = pal.1, guide = "legend") +
    labs(title = Title, y = "", x = "") +
    theme(plot.title = element_text(size = 14, hjust = 0.5)) +
    theme(axis.text.y = element_text(size = 16,hjust = 0)) +
    theme(legend.position = "bottom")

g

stacked bar chart, running in the wrong direction

The stacked pieces of the bars run from right to left, from 'Not at all' to 'Very confident'. The items are in the correct order, from 'Multimedia' to 'Personal', ordered by the proportion of those who said 'Not at all' to each item.

What I want to get is this graph with the responses ordered the other way, the same way as the legend, that is from 'Not at all' on the left, to 'Very confident' on the right. I cannot figure out how this ordering is set, nor how to change it.

I've read through the 'similar questions', but can see no answer to this specific query. Suggestions, using ggplot, not base R graphics, welcome.

OK, building on the useful and much-appreciated answer from allstaire, I tried the following:

library(tidyverse)

tab <- structure(list(Item = c("Personal", "Peripheral", "Communication", "Multimedia", "Office", "Social Media"), `Not at all` = c(3.205128, 18.709677, 5.844156, 31.578947, 20.666667, 25.827815), Somewhat = c(30.76923, 23.87097, 24.67532, 18.42105, 30, 16.55629), `Don't know` = c(0.6410256, 2.5806452, 1.9480519, 11.1842105, 2.6666667, 5.9602649), Confident = c(32.69231, 29.67742, 33.11688, 17.10526, 23.33333, 27.15232), `Very confident` = c(32.69231, 25.16129, 34.41558, 21.71053, 23.33333, 24.50331)), .Names = c("Item", "Not at all", "Somewhat", "Don't know", "Confident", "Very confident"), row.names = c(NA, -6L), class = "data.frame")

tab <- tab %>% select(1, 6, 5, 4, 3, 2) ## Reverse the order of the response columns; Item stays first

tab.m <- tab %>% arrange(`Not at all`) %>%
    mutate(Item = factor(Item, levels = Item[order(`Not at all`)])) %>% 
    gather(variable, value, -Item, factor_key = TRUE)

ggplot(data = tab.m, aes(x = Item, y = value, fill = variable)) + 
geom_col() +
coord_flip() +
scale_fill_brewer("Percent", type = 'cat', palette = 'BrBG', 
                  guide = guide_legend(reverse = TRUE)) +
labs(title = 'Plot title', y = NULL, x = NULL) +
theme(legend.position = "bottom")

And this is exactly the graph I want, so my pressing problem is solved.

Stacked bar plot laid out correctly

However, if I say instead

ggplot(data = tab.m, aes(x = Item, y = value, fill = variable)) + 
geom_col() +
coord_flip() +
scale_fill_brewer("Percent", type = 'cat', palette = 'BrBG', 
                  guide = guide_legend(reverse = FALSE)) +
labs(title = 'Plot title', y = NULL, x = NULL) +
theme(legend.position = "bottom")

The picture I get is this

Stacked bar chart, legend going in wrong direction

Here the body of the chart is correct, but the legend is going in the wrong direction.

This solves my problem, but does not quite answer my question. I start with a dataframe, and to get what I want I have to reverse the order of the data columns, and reverse the guide legend. This evidently works, but it's perverse.

So, how does a stacked bar chart decide in what order to present the stacked items? It's clearly related to their order in the melted dataset, but simply changing the order leaves the legend going in the wrong direction. Looking at the melted dataset, tab.m, from top to bottom, the responses are in the order 'Very confident' to 'Not at all', but the default legend is the reverse order 'Not at all' to 'Very confident'.


Solution

  • If you pass a guide_legend(...) call as the guide argument instead of just a string, you can set its reverse parameter to TRUE. Simplifying a bit,

    library(tidyverse)
    
    tab <- structure(list(Item = c("Personal", "Peripheral", "Communication", "Multimedia", "Office", "Social Media"), `Not at all` = c(3.205128, 18.709677, 5.844156, 31.578947, 20.666667, 25.827815), Somewhat = c(30.76923, 23.87097, 24.67532, 18.42105, 30, 16.55629), `Don't know` = c(0.6410256, 2.5806452, 1.9480519, 11.1842105, 2.6666667, 5.9602649), Confident = c(32.69231, 29.67742, 33.11688, 17.10526, 23.33333, 27.15232), `Very confident` = c(32.69231, 25.16129, 34.41558, 21.71053, 23.33333, 24.50331)), .Names = c("Item", "Not at all", "Somewhat", "Don't know", "Confident", "Very confident"), row.names = c(NA, -6L), class = "data.frame")
    
    tab.m <- tab %>% arrange(`Not at all`) %>%
        mutate(Item = factor(Item, levels = Item[order(`Not at all`)])) %>% 
        gather(variable, value, -Item, factor_key = TRUE)
    
    ggplot(data = tab.m, aes(x = Item, y = value, fill = variable)) + 
        geom_col() +
        coord_flip() +
        scale_fill_brewer("Percent", palette = 'BrBG', 
                          guide = guide_legend(reverse = TRUE)) +
        labs(title = 'Plot title', y = NULL, x = NULL) +
        theme(legend.position = "bottom")
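
As a hedged aside that is not part of the answer above: recent versions of ggplot2 also let you reverse the stacking itself rather than the legend, via position_stack(reverse = TRUE), which the ggplot2 documentation describes as useful when both the plot and the legend are rotated. The sketch below uses a small made-up data frame (toy, with hypothetical values) in place of tab.m, and assumes the factor levels already run from 'Not at all' upwards.

```r
library(ggplot2)

# Toy long-format data (hypothetical values), standing in for tab.m
toy <- data.frame(
  Item = rep(c("A", "B"), each = 3),
  variable = factor(rep(c("Not at all", "Somewhat", "Confident"), times = 2),
                    levels = c("Not at all", "Somewhat", "Confident")),
  value = c(20, 30, 50, 10, 40, 50)
)

# reverse = TRUE stacks the first factor level nearest the axis, so after
# coord_flip() the segments read left to right in level order
p <- ggplot(toy, aes(x = Item, y = value, fill = variable)) +
  geom_col(position = position_stack(reverse = TRUE)) +
  coord_flip() +
  scale_fill_brewer("Percent", palette = "BrBG") +
  theme(legend.position = "bottom")

p
```

With this approach the legend is left in its default order, so neither the columns nor the guide need to be reversed. Whether that is preferable to guide_legend(reverse = TRUE) is a matter of taste.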
    


    For the edit:

    Bar order is determined by factor level order, which in the above is determined by column order due to the use of gather to create the factor, though coord_flip is making it less obvious. It's easy to reverse level order by reassembling the factor, though. (Be careful with levels<-: assigning rev(levels(x)) that way relabels the values rather than reordering the levels.) To keep the colors with the same levels, pass direction = -1 to scale_fill_brewer to reverse their order, as well.

    tab.m <- tab %>% arrange(`Not at all`) %>%
        mutate(Item = factor(Item, levels = Item[order(`Not at all`)])) %>% 
        gather(variable, value, -Item, factor_key = TRUE) %>% 
        mutate(variable = factor(variable, levels = rev(levels(variable)), ordered = TRUE))
    
    ggplot(data = tab.m, aes(x = Item, y = value, fill = variable)) + 
        geom_col() +
        coord_flip() +
        scale_fill_brewer("Percent", palette = 'BrBG', direction = -1,
                          guide = guide_legend(reverse = TRUE)) +
        labs(title = 'Plot title', y = NULL, x = NULL) +
        theme(legend.position = "bottom")
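
The level-reversal step can be checked in isolation with base R. This is a minimal sketch on a toy factor (variable here is a hypothetical, shortened stand-in for tab.m$variable), showing that reassembling the factor changes only the level order, while a plain levels<- assignment changes the labels attached to the data.

```r
# Toy factor standing in for tab.m$variable (hypothetical, shortened data)
responses <- c("Not at all", "Somewhat", "Don't know", "Confident", "Very confident")
variable <- factor(rep(responses, each = 2), levels = responses)

# Reassemble the factor with reversed levels: values stay, order of levels flips
reversed <- factor(variable, levels = rev(levels(variable)))
levels(reversed)
identical(as.character(reversed), as.character(variable))  # values are unchanged

# Pitfall: assigning reversed levels with levels<- relabels the observations
relabelled <- variable
levels(relabelled) <- rev(levels(relabelled))
head(as.character(relabelled))  # no longer matches the original values
```

Because ggplot2 stacks and colours by level order, only the first form is a safe way to flip the stacking without altering which response each value belongs to.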