Search code examples
rggplot2geom-bar

Stacked bar chart : what if each subcategory (fill) is unique?


Context
I have a data set where observations are described using categories and subcategories. Subcategories are such that a given one is linked to one category only (think "category ~ car brand" and "subcategory ~ car model").

Imagine I count the number of cars I can see passing each day through my window. I should get a data set similar to this:

 _______________________________________________________
| Date      | Brand         | Model         | Count     |
|===========:===============:===============:===========|
| 18-01-01  | Ford          | Model T       | 1         |
| 18-01-01  | Ford          | Focus         | 13        |
| 18-01-01  | Tesla         | Model X       | 17        |
| 18-01-02  | Ford          | Model T       | 1         |
| 18-01-02  | Honda         | Civic         | 210       |
| _         | _             | _             | _         |
|___________|_______________|_______________|___________|

Problem
I want to create a (horizontal) bar plot showing the count of entries grouped by category (i.e. each bar represents one brand). Moreover, I would like to have each bar graphically subdivided per model (i.e. the closest to axis part of a bar representing the most counted model of the brand, then the second-most counted model, etc.).

I can produce such graph, but each subcategory is then represented as a single legend item (see the first graph in the example below). But in my real example I have a dozen of categories — each of them using a dozen of subcategories (hence about 150 subcategories in total). This makes such solution not "usable".

enter image description here

Question
How to produce such a graph, where each bar (~brand) has one color (blue/red/yellow), and each subcategory a monochromatic shade/tint variation of it (dark blue/medium blue/light blue; dark red/…)?
(Indeed, subcategories are car models, hence specific to each brand — and not a generic/multi-brand category such as "the most observed model of the brand". Moreover, the amount of model per brand varies.)

enter image description here


Example

I can produce a close result using following code (see data at the bottom of the example):

library(ggplot2)
df = read.csv('fake-data.csv', header = TRUE)
df <- df[order(df$car_brand, decreasing = FALSE),]
ggplot(df, aes(x=car_brand, y=count, fill = car_model)) +
  geom_bar(position = 'stack', stat = 'identity') +
  coord_flip()

However, I get a single color and legend entry for each single model. I would prefer one color per brand, with a variation in shade/tint of the given color for each model of the brand.

Here is the content of fake-data.csv, with obviously fake data:

month,car_brand,car_model,count
18-01,Tesla,Model X,8
18-01,Ford,Model T,11
18-01,Ford,Focus,9
18-01,Ford,Focus,19
18-01,Tesla,Model 3,8
18-01,BMW,1 series,4
18-01,Ford,Model T,18
18-01,Honda,Civic,13
18-01,Ford,Model T,9
18-01,Tesla,Model S,18
18-01,BMW,1 series,6
18-01,Ford,Focus,10
18-01,Honda,Civic,9
18-01,Audi,A6,14
18-01,Audi,R8,19
18-01,Ford,Focus,13
18-01,BMW,1 series,7
18-01,Tesla,Model 3,12
18-01,BMW,1 series,11
18-01,BMW,1 series,9
18-01,BMW,1 series,4
18-01,BMW,1 series,11
18-01,Ford,Model T,17
18-01,Honda,Civic,10
18-01,BMW,1 series,9
18-01,Ford,Focus,19
18-01,Honda,Civic,9
18-01,Ford,Focus,15
18-01,Audi,A8,12
18-01,Tesla,Model X,6
18-01,Honda,Civic,14
18-01,BMW,1 series,16
18-01,Tesla,Model X,18
18-01,Tesla,Model X,16
18-01,Audi,TT,20
18-01,Tesla,Model 3,9
18-01,Tesla,Model X,21
18-01,BMW,1 series,9
18-01,Audi,A8,18
18-01,BMW,1 series,2
18-01,Ford,Focus,2
18-01,Honda,Civic,7
18-01,Tesla,Model X,9
18-01,Honda,Civic,3
18-01,BMW,1 series,5
18-01,Ford,Focus,14
18-01,Honda,Civic,4
18-01,Tesla,Model S,4
18-01,Honda,Civic,7
18-01,Honda,Civic,13
18-01,Tesla,Model 3,3
18-01,Tesla,Model 3,9
18-01,BMW,1 series,13
18-01,Ford,Model T,11
18-01,Ford,Focus,10
18-01,Tesla,Model S,18
18-01,Audi,Q3,6
18-01,Audi,R8,13
18-01,Tesla,Model X,21
18-01,BMW,1 series,13
18-01,Ford,Focus,17
18-01,Tesla,Model X,14
18-01,Audi,TT,3
18-01,Ford,Model T,11
18-01,Honda,Civic,7
18-01,Ford,Focus,4
18-01,Honda,accord,6
18-01,Ford,Focus,10
18-01,Ford,Model T,10
18-01,Honda,Civic,15
18-01,Ford,Model T,2
18-01,Tesla,Model X,10
18-01,Ford,Focus,11
18-01,Tesla,Model X,14
18-01,Honda,Civic,13
18-01,BMW,1 series,19
18-01,BMW,1 series,21
18-01,Ford,Focus,8
18-01,Tesla,Model X,12
18-01,Honda,Civic,5
18-01,Honda,Civic,14
18-01,Honda,Civic,17
18-01,Audi,R8,16
18-01,Honda,Civic,12
18-01,Audi,A6,20
18-01,Tesla,Model X,4
18-01,Audi,TT,4
18-01,Ford,Focus,16
18-01,Audi,Q3,16
18-01,BMW,1 series,12
18-01,Audi,A8,18
18-01,Honda,Civic,1
18-01,Audi,A8,7
18-01,Audi,Q3,10
18-01,Tesla,Model X,18
18-01,Ford,Focus,19
18-01,Ford,Model T,2
18-01,Tesla,Model 3,15
18-01,Ford,Model T,13
18-01,Ford,Model T,2
18-01,Audi,Q3,14
18-01,BMW,1 series,4
18-01,Audi,R8,1
18-01,Honda,Civic,2
18-01,Tesla,Model 3,4
18-01,BMW,1 series,16
18-01,Audi,A8,5
18-01,Ford,Model T,18
18-01,Tesla,Model X,21
18-01,Ford,Focus,4
18-01,Ford,Focus,7
18-01,BMW,1 series,16
18-01,Tesla,Model X,16
18-01,Tesla,Model 3,14
18-01,BMW,1 series,8
18-01,BMW,1 series,13
18-01,Tesla,Model 3,7
18-01,Ford,Focus,21
18-01,BMW,1 series,14
18-01,BMW,1 series,10
18-01,Ford,Focus,11
18-01,Tesla,Model 3,13
18-01,Honda,Civic,4
18-01,Ford,Focus,11
18-01,Ford,Focus,8
18-01,BMW,1 series,18
18-01,Honda,Civic,18
18-01,Honda,Civic,15
18-01,Ford,Focus,9
18-01,Tesla,Model 3,4
18-01,BMW,1 series,5
18-01,Tesla,Model S,5
18-01,Audi,TT,12
18-01,Honda,Civic,17
18-01,BMW,1 series,9
18-01,Honda,Civic,7
18-01,Tesla,Model 3,15
18-01,Audi,A8,21
18-01,Ford,Model T,21
18-01,Ford,Model T,9
18-01,BMW,1 series,18
18-01,Tesla,Model 3,7
18-01,BMW,1 series,15
18-01,BMW,1 series,2
18-01,Ford,Model T,18
18-01,Audi,R8,17
18-01,Tesla,Model 3,3
18-01,Audi,A8,9
18-01,BMW,1 series,10
18-01,Audi,Q3,4
18-01,BMW,1 series,8
18-01,Honda,accord,19
18-01,Tesla,Model S,6
18-01,Audi,TT,18
18-01,Audi,Q3,21
18-01,Tesla,Model S,3
18-01,Tesla,Model S,9
18-01,Audi,Q3,1
18-01,Tesla,Model X,18
18-01,Honda,Civic,8
18-01,Audi,R8,14
18-01,Honda,Civic,21
18-01,Tesla,Model X,9
18-01,Audi,TT,16
18-01,Audi,A8,19
18-01,Ford,Focus,2
18-01,BMW,1 series,12
18-01,Ford,Model T,9
18-01,Tesla,Model X,9
18-01,Audi,R8,18
18-01,Honda,Civic,3
18-01,Honda,accord,7
18-01,Audi,A6,13
18-01,Audi,A8,13
18-01,Ford,Focus,8
18-01,Honda,accord,10
18-01,Audi,R8,20
18-01,Honda,Civic,18
18-01,Ford,Focus,7
18-01,Audi,R8,10
18-01,Audi,A6,13
18-01,Honda,Civic,4
18-01,Audi,A8,7
18-01,Audi,Q3,15
18-01,Honda,Civic,10
18-01,Audi,A8,6
18-01,Honda,Civic,1
18-01,Tesla,Model 3,21
18-01,Ford,Model T,7
18-01,BMW,1 series,6
18-01,Honda,Civic,4
18-01,Audi,A6,12
18-01,Honda,Civic,6
18-01,Tesla,Model S,17
18-01,Tesla,Model S,2
18-01,Tesla,Model X,6
18-01,Audi,A8,2
18-01,Tesla,Model 3,14
18-01,BMW,1 series,4
18-01,BMW,1 series,20
18-01,Honda,accord,17
18-01,Honda,Civic,14
18-01,BMW,1 series,16
18-01,Audi,A8,17
18-01,Audi,A6,11
18-01,Ford,Model T,1
18-01,BMW,1 series,18
18-01,Tesla,Model 3,11
18-01,Honda,Civic,21

Solution

  • Here's a way to map to alpha - it's the closest thing I know of to get what you are after.

    library(tidyverse)
    df %>%
      group_by(car_brand, car_model) %>%
      summarise_at(vars(count), sum) %>%
      group_by(car_brand) %>%
      mutate(
        model_rank = car_model %>% rank(),
      ) %>%
      ggplot(aes(x = car_brand, y = count, fill = car_brand, alpha = model_rank)) +
      scale_alpha_continuous(range = c(1, .25)) + 
      theme(legend.position = 'none') + 
      geom_bar(position = 'stack', stat = 'identity') + 
      coord_flip()