ggplot2 stacked area chart not filling between years


I have data structured as follows (this is merely an example):

    year    company cars
    2011    toyota  609
    2011    honda   710
    2011    ford    77
    2011    nissan  45
    2011    chevy   11
    2012    toyota  152
    2012    honda   657
    2012    ford    128
    2012    nissan  159
    2012    chevy   322
    2013    toyota  907
    2013    honda   656
    2013    ford    138
    2013    nissan  270
    2013    chevy   106
    2014    toyota  336
    2014    honda   957
    2014    ford    204
    2014    nissan  219
    2014    chevy   282

I want to make a stacked area chart. With one data set formatted exactly as above, the following call fills in the areas between the years nicely:

    ggplot(data, aes(x = year, y = cars, fill = company)) + geom_area()

[image: stacked area chart with the areas between years filled in]

However, with another data set in exactly the same format, the same code (only the data source changes) does not fill in the area between the years and creates a mess:

    ggplot(data2, aes(x = year, y = cars, fill = company)) + geom_area()

[image: stacked area chart with gaps between years]

Note that all the points connect at each year. The odd gaps appear only between years.

Does anyone have any suggestions about the possible source of this error?


Solution

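  • You need to sort the data by company and then by year. The following example illustrates this: the same data is plotted in its original order, then with the rows shuffled, and finally after sorting the shuffled rows again.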

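    library("ggplot2")
    library("dplyr")

    set.seed(42)  # make the random example reproducible

    # Example data: 10 companies, one row per company and year,
    # already ordered by company and then by year
    data <- data.frame(year = rep(1991:2000, times = 10),
                       company = as.factor(rep(1:10, each = 10)),
                       cars = runif(n = 100, min = 500, max = 1000))

    # Ordered rows: the areas are filled as expected
    ggplot(data, aes(x = year, y = cars, fill = company)) +
      geom_area()

    # Randomly shuffle the rows
    data2 <- data[sample(nrow(data)), ]

    # Shuffled rows: this reproduces the gaps described in the question
    ggplot(data2, aes(x = year, y = cars, fill = company)) +
      geom_area()

    # Restore the order: sort by company, then by year
    data3 <- arrange(data2, company, year)

    # Sorted rows: the areas are filled correctly again
    ggplot(data3, aes(x = year, y = cars, fill = company)) +
      geom_area()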
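
If you would rather not depend on dplyr, the same sort can be sketched in base R with order(). This is a minimal sketch and not the only way to do it. It uses the example table from the question; the names cars_df, shuffled and sorted are made up for illustration, and the shuffle only simulates a data source whose rows arrive in arbitrary order.

```r
# The example table from the question
cars_df <- data.frame(
  year    = rep(2011:2014, each = 5),
  company = rep(c("toyota", "honda", "ford", "nissan", "chevy"), times = 4),
  cars    = c(609, 710,  77,  45,  11,
              152, 657, 128, 159, 322,
              907, 656, 138, 270, 106,
              336, 957, 204, 219, 282),
  stringsAsFactors = FALSE
)

set.seed(1)  # arbitrary seed, only for reproducibility
# Simulate a data source whose rows are in arbitrary order
shuffled <- cars_df[sample(nrow(cars_df)), ]

# Base R counterpart of dplyr::arrange(shuffled, company, year)
sorted <- shuffled[order(shuffled$company, shuffled$year), ]
rownames(sorted) <- NULL  # drop the row names left over from the shuffle

# Sanity check: companies are grouped, and years increase within each company
stopifnot(
  !is.unsorted(sorted$company),
  all(tapply(sorted$year, sorted$company, function(y) !is.unsorted(y)))
)

# Plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(sorted, ggplot2::aes(x = year, y = cars, fill = company)) +
    ggplot2::geom_area()
  print(p)
}
```

The stopifnot() call is only there to confirm that the sort did what was intended before plotting; you can drop it once you trust the data source.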