r ggplot2 colors bar-chart fill

How to create a stacked barplot using Groups to determine Fill and X variable to determine Color (ggplot2)?


I cannot get a stacked barplot in ggplot2 that uses one set of colors for the fill (the same in each column) and another for the trim around the columns (a different trim for each column). The fill is defined by the same variable as the group, and the color by the same variable as the x axis. The fill works, but the color doesn't, as R does not seem to recognize the grouping.

I am creating a percent stacked barplot using ggplot2. One variable with three levels defines the group ("Geology") and another one with four levels defines the x axis ("Region"). The y axis is a percentage, as the four regions have different numbers of data points and I want to compare them visually. That is, the resulting plot has 4 columns of the same height (one per region), and each column can be divided into three (or fewer) segments.

I want a different fill color for each group, hence I use "group=Geology, fill=Geology". I also want every column outlined in a single color defined by the Region (matching other figures in the same project), so I use "x=Region, color=Region".

ggplot() + 
  geom_bar(data=All_data, position=position_fill(), aes(group=Geology, x=Region, color=Region, fill=Geology), show.legend = TRUE) +
  theme_classic() +
  ggtitle("Geology") +
  scale_y_continuous(name = NULL, labels= NULL) +
  scale_color_manual(values=c("#B71C1C",
                              "#923ffc",
                              "#fce571",
                              "#70C339"))+
  scale_fill_manual(values=c("black",
                             "darkgrey",
                             "lightgrey"))+
  scale_x_discrete(labels=c("Alps","Central Highlands","Central Plains","Iberia"), name=NULL, guide = guide_axis(angle = 45))

Nevertheless, when I try to plot it, the "color" part doesn't work:

Result

R also gives me an error message:

Error

To be more clear, this is the desired outcome:

Desired outcome

I am not sure where my mistake is; maybe you can help me.

Thanks a lot!

Edit:

Following the code suggested in the comments, I added stat="identity" and a numeric value for y, and that seems to partially solve the issue (I don't understand why!):

ggplot() + 
  geom_bar(data=All_data, position="fill", stat="identity", aes(fill=Geology, y=1, x=Region, colour=Region), show.legend = TRUE) +
  theme_classic() +
  ggtitle("Geology") +
  scale_y_continuous(name = NULL, labels= NULL) +
  scale_colour_manual(values=c("#B71C1C",
                              "#923ffc",
                              "#fce571",
                              "#70C339"))+
  scale_fill_manual(values=c("#555555",
                             "#999999",
                             "#DDDDDD"))+
  scale_x_discrete(labels=c("Alps","Central Highlands","Central Plains","Iberia"), name=NULL, guide = guide_axis(angle = 45))

Nevertheless, instead of producing a single trim surrounding each segment of the columns, as in your example, it generates a number of divisions equal to the number of observations (i.e. if there are three cases where the Geology is "Calcareous" in the Region "Iberia", the corresponding portion of the column is divided into three).

New plot
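
A likely explanation for the extra divisions, sketched under the assumption that All_data holds one row per observation: with stat = "identity", geom_bar() draws one rectangle per row instead of counting rows, so every observation gets its own outlined slice. The default stat = "count" collapses the rows of each group into a single rectangle. The data frame `toy` below is hypothetical and exists only to compare the number of rectangles each variant produces.

```r
library(ggplot2)

# Hypothetical toy data: one row per observation (not the real All_data)
toy <- data.frame(
  Region  = c("Iberia", "Iberia", "Iberia", "Alps", "Alps"),
  Geology = c("Calcareous", "Calcareous", "Calcareous", "Mixed", "Siliceous")
)

# stat = "identity": every row becomes its own rectangle of height y = 1
p_identity <- ggplot(toy) +
  geom_bar(aes(x = Region, y = 1, fill = Geology, colour = Region),
           stat = "identity", position = "fill")

# Default stat = "count": rows are counted per group,
# giving one rectangle per Region x Geology combination
p_count <- ggplot(toy) +
  geom_bar(aes(x = Region, fill = Geology, colour = Region),
           position = "fill")

# layer_data() returns one row per rectangle that would be drawn
n_identity <- nrow(layer_data(p_identity))
n_count    <- nrow(layer_data(p_count))
```

Comparing n_identity with n_count shows the difference: the first equals the number of rows in `toy`, the second the number of distinct Region and Geology combinations.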

Edit 2:

The problem was solved in the comments. The solution was to get rid of the "group" aesthetic. Check out the comment section for the explanation.


Solution

  • Simply remove the group aesthetic from inside aes. The group = Geology aesthetic mapping here means that you want all fills and colours to be the same within each geology type, but this is explicitly not what you are trying to do. Instead, you wish each unique combination of region and geology to have its own unique combination of colour and fill.

    If you remove the group aesthetic, ggplot2 falls back to its default grouping, which is the interaction of all discrete variables mapped in the layer (here the variables assigned to the x, color and fill aesthetics). In other words, we would get the same effect if we used group = interaction(Region, Geology).

    library(ggplot2)
    
    ggplot(All_data) + 
      # linewidth sets the outline thickness (ggplot2 >= 3.4.0; use size = 1.5 on older versions)
      geom_bar(position = position_fill(), linewidth = 1.5,
               aes(x = Region, color = Region, fill = Geology)) +
      theme_classic() +
      ggtitle("Geology") +
      scale_y_continuous(name = NULL, labels= NULL) +
      scale_color_manual(values=c("#B71C1C", "#923ffc", "#fce571", "#70C339"),
                         guide = 'none') +
      scale_fill_manual(values=c("black", "darkgrey","lightgrey")) +
      scale_x_discrete(labels = c("Alps", "Central Highlands", "Central Plains", 
                                  "Iberia"), name = NULL, 
                       guide = guide_axis(angle = 45))
    



    Data used (inferred approximately from plot in question)

    All_data <- data.frame(Geology = rep(rep(c('Calcareous', 'Mixed', 'Siliceous'), 
                                             4), times = c(32, 0, 68, 23, 0, 77,
                                                           15, 85, 0, 11, 11, 78)),
                           Region = rep(c("Alps", "Central Highlands", 
                                          "Central Plains", "Iberia"), each = 100))