
Labeling individual bars in stacked ggplot bar graph


Is there an easy way to label the individual "blocks" in a stacked bar graph such as the one below? I'd like each label to be positioned near the top of its block, but my latest approach somehow swaps the texts for USA and Mexico, as shown below.

example bar graph

Looking around for a solution, I've only found approaches in which the y value for the text has to be pre-computed externally, which, aside from the extra logic, raises the issue of controlling the order in which the blocks are stacked.
I also found this Stack Overflow question, where I got the idea of using geom="text" in a stat_bin (see code below).
Here's a trimmed-down code snippet illustrating my current approach. I'm not necessarily trying to fix this snippet; any generic idiom for labeling the areas of a stacked bar graph will do!
Edit (in view of the two answers this question has received so far):
I'd like to stress that I'd prefer solutions that don't require pre-computing the y position of the text.

# sample data source
df.StackData <- data.frame(
    QType = c("A4-1", "A4-1", "A4-1",  "B3", "B3", "B3"),
    Country = c("Canada", "USA", "Mexico", "Canada", "USA", "Mexico"),
    NbOfCases = c(1000, 1320, 380, 400, 1000, 812),
    AvgRate = c(17.2, 11.4, 44.21, 17.3, 15.3, 39.7),
    Comment = c("Can", "US", "Mex", "Can", "US", "Mex")
)

And here is the ggplot invocation. It produces the graph shown above, with the odd swap of labels (and also an extra legend, though this legend issue is easy to take care of; I only noticed it while preparing this question).

ggplot(data=df.StackData,
       aes(x=QType, y=NbOfCases, fill=Country))+
  geom_bar(stat="identity", width=1) +
  stat_bin(geom="text", aes(label=paste("R coef =",
                                        formatC(AvgRate, format="f", digits=3),
                                        "(", Comment, ")" ),
                            vjust=1.5, size=3 
                        )
  )

My initial attempts added a geom_text() to the graph as follows, but of course the y value was wrong (it places each text relative to the very bottom of the graph rather than relative to the bottom of its individual block):

  ... +
  geom_text(mapping=aes(x=QType, y=NbOfCases, 
                        label=paste("R coef =",
                                    formatC(AvgRate, format="f", digits=3),
                                    "(", Comment, ")" ),
                         vjust=1.5),
            size=3)

Solution

  • Here's a solution. There are two things to do. First, reorder the levels of the Country factor so that they follow the same order as the rows of df.StackData. Second, create another data.frame that holds the y-positions of the labels, computed as the cumulative sums of NbOfCases within each QType.

    # reorder levels of factor to the same order as found in data
    df.StackData$Country <- factor(df.StackData$Country, 
              levels=c("Canada", "USA", "Mexico"), ordered=TRUE)
    p <- ggplot(data=df.StackData, aes(x=QType, fill=Country))
    p <- p + geom_bar(aes(weight=NbOfCases))  # the aesthetic is named "weight"
    
    # compute corresponding y-axis positions by cumulative sum
    library(plyr)  # provides ddply() and .()
    df <- ddply(df.StackData, .(QType), function(x) {
        x$NbOfCases <- cumsum(x$NbOfCases)
        x
    })
    
    # then use geom_text with data = df (the newly created data)
    p + geom_text(data = df,  aes(x=QType, y=NbOfCases, 
            label=paste("R coef =", 
            formatC(AvgRate, format="f", digits=3), 
            "(", Comment, ")" ), vjust=1.5), size=3)
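
The cumulative-sum step above can also be sketched without plyr. This is a minimal illustration using base R's ave(); the names df.labels and ypos are made up for this example and do not appear in the original answer. Note that, as far as I know, ggplot2 2.2.0 and later stack the groups in the reverse order of the factor levels, so on those versions the sums may need to be accumulated in the opposite order to line up with the drawn blocks.

```r
# Same sample data as in the question (only the columns needed here)
df.StackData <- data.frame(
  QType     = c("A4-1", "A4-1", "A4-1", "B3", "B3", "B3"),
  Country   = c("Canada", "USA", "Mexico", "Canada", "USA", "Mexico"),
  NbOfCases = c(1000, 1320, 380, 400, 1000, 812)
)

# Hypothetical helper column: running total of NbOfCases within each QType,
# i.e. the height of the top edge of each block if blocks are stacked in row order
df.labels <- df.StackData
df.labels$ypos <- ave(df.labels$NbOfCases, df.labels$QType, FUN = cumsum)

print(df.labels)
```

The ypos column can then be mapped to y in geom_text(data = df.labels, ...) in place of the overwritten NbOfCases column, which keeps the original counts available.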
    


    Edit: If you don't want to calculate the y-position yourself, you'll have to use stat_bin. Just reorder the levels of the Country column and it works:

    # data
    df.StackData <- data.frame(
        QType = c("A4-1", "A4-1", "A4-1",  "B3", "B3", "B3"),
        Country = c("Canada", "USA", "Mexico", "Canada", "USA", "Mexico"),
        NbOfCases = c(1000, 1320, 380, 400, 1000, 812),
        AvgRate = c(17.2, 11.4, 44.21, 17.3, 15.3, 39.7),
        Comment = c("Can", "US", "Mex", "Can", "US", "Mex")
    )
    
    # just add this: reorder the levels
    df.StackData$Country <- factor(df.StackData$Country, 
              levels=c("Canada", "USA", "Mexico"), ordered=TRUE)
    
    # your code again using stat_bin (just changed the width to 0.75)
    ggplot(data=df.StackData,
           aes(x=QType, y=NbOfCases, fill=Country))+
      geom_bar(stat="identity", width=.75) +
      stat_bin(geom="text", size=4, aes(label=paste("R coef =",
                                            formatC(AvgRate, format="f", digits=3),
                                            "(", Comment, ")" ),
                                vjust=1.5))
    

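
A note for readers on newer ggplot2 versions, offered as a sketch and not as part of the original answer: recent releases, as far as I know, refuse a y aesthetic in stat_bin, so the snippet above may fail there. Assuming ggplot2 2.2.0 or later, geom_text() can be given position = position_stack(), which stacks the labels by the same rule as the bars and so avoids pre-computing y. The explicit group = Country mapping and the particular vjust values are choices made for this illustration.

```r
library(ggplot2)

# Same sample data as in the question
df.StackData <- data.frame(
  QType     = c("A4-1", "A4-1", "A4-1", "B3", "B3", "B3"),
  Country   = c("Canada", "USA", "Mexico", "Canada", "USA", "Mexico"),
  NbOfCases = c(1000, 1320, 380, 400, 1000, 812),
  AvgRate   = c(17.2, 11.4, 44.21, 17.3, 15.3, 39.7),
  Comment   = c("Can", "US", "Mex", "Can", "US", "Mex")
)

p <- ggplot(df.StackData, aes(x = QType, y = NbOfCases, fill = Country)) +
  geom_col(width = 0.75) +  # bars with heights taken from y, stacked by default
  geom_text(
    aes(group = Country,    # group explicitly so labels stack like the bars
        label = paste("R coef =",
                      formatC(AvgRate, format = "f", digits = 3),
                      "(", Comment, ")")),
    position = position_stack(vjust = 1),  # anchor each label at the top of its block
    vjust = 1.5,                           # then shift the text down into the block
    size = 3
  )

print(p)
```

Because bars and labels go through the same stacking position, reordering the levels of Country should move each label together with its block.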