
Showing all variables in a facetted bar chart


I'd like to produce a bar chart that highlights one category at a time. I use geom_col with a facet for each of the countries.

The issue is that the coloured bar segment does not align with the 'greyed' segment of the same country; instead it is always drawn at the bottom of the chart.

[Image: Steel production chart]

Produced with this code:


# Attach the packages that provide tribble() and ggplot()
  library(tidyverse)

# Steel production data
  steel <- tribble(
    ~country,   ~"2016",    ~"2017",    ~"2018",    ~"2019",
    "China",     828.4,  853.7,  863.9,  988.2, 
    "Japan",     104.9,  104.7,  104.2,  99.6, 
    "India",     95.0,   101.5,  107.8,  111.5, 
    "USA",     80.2,     81.6,   84.2,   88.2, 
    "Other",     564.8,  577.7,  587.8,  549.9 
  )

# Pivot the data and turn country into factors
  steel_long <- tidyr::pivot_longer(steel, -country, names_to = "year", values_to = "production")
  names(steel_long) <- tolower(names(steel_long))

  steel_long$country <- as.factor(steel_long$country)
  steel_long$country <- forcats::fct_relevel(steel_long$country, "Other", after = Inf) # Always put RotW last

  steel_long$country2 <- steel_long$country # Add second country to add the grey lines on the mini charts

  steel_long$year <- lubridate::make_date(year = steel_long$year, 12, 31)

# Graph - Column
  ggplot() +
    geom_col(data = steel_long[, 2:4], 
             mapping = aes(x = year, y = production, group = country2), colour = "white", fill = "grey", lwd = 1) +
    geom_col(data = steel_long, mapping = aes(x = year, y = production, fill = country), lwd = 1.1) +
    facet_wrap(~country) +
    labs(title = "Global steel production (Source: World Steel Association)", x = "", y = "Million metric tons") +
    guides(fill = "none") +
    theme_minimal()

Is it possible to colour the specific area of the column that is related to the country?

Thank you


Solution

  • Try this. The basic idea is to replicate the dataset once per country. When facetting, the dataset is split according to the facet variable, so each panel only draws the rows that belong to it. By replicating the dataset we ensure that the column plot in each facet is made up of the whole dataset. Apart from the facet label country3, the copies differ only in the country_fill column, which holds a colour for the country to highlight and grey for all others. To apply these fill colours in the plot I use scale_fill_identity, which takes the values in country_fill as literal colours.

    library(tidyverse)
    
    # Steel production data
    steel <- tribble(
      ~country,   ~"2016",    ~"2017",    ~"2018",    ~"2019",
      "China",     828.4,  853.7,  863.9,  988.2, 
      "Japan",     104.9,  104.7,  104.2,  99.6, 
      "India",     95.0,   101.5,  107.8,  111.5, 
      "USA",     80.2,     81.6,   84.2,   88.2, 
      "Other",     564.8,  577.7,  587.8,  549.9 
    )
    
    # Pivot the data and turn country into factors
    steel_long <- tidyr::pivot_longer(steel, -country, names_to = "year", values_to = "production")
    names(steel_long) <- tolower(names(steel_long))
    
    steel_long$country <- as.factor(steel_long$country)
    steel_long$country <- forcats::fct_relevel(steel_long$country, "Other", after = Inf) # Always put RotW last
    
    steel_long$country2 <- steel_long$country # Add second country to add the grey lines on the mini charts
    
    steel_long$year <- lubridate::make_date(year = steel_long$year, 12, 31)
    
    # Colors
    colors <- scales::hue_pal()(5) %>% 
      setNames(unique(steel_long$country)) %>% 
      tibble::enframe(name = "country3", value = "country_fill")
    
    # Replicate dataframe
    steel_long_rep <- purrr::map(unique(steel_long$country), ~ steel_long) %>% 
      setNames(unique(steel_long$country)) %>% 
      bind_rows(.id = "country3") %>%
      # Join colors
      left_join(colors) %>% 
      # Set fill for non-facet countries to grey
      mutate(country_fill = ifelse(country != country3, "grey", country_fill))
    #> Joining, by = "country3"
    
    steel_long_rep$country3 <- forcats::fct_relevel(steel_long_rep$country3, "Other", after = Inf) 
    
    # Graph - Column
    ggplot() +
      geom_col(data = steel_long_rep, mapping = aes(x = year, y = production, group = country, fill = country_fill), colour = "white", lwd = 1) +
      scale_fill_identity() +
      facet_wrap(~country3) +
      labs(title = "Global steel production (Source: World Steel Association)", x = "", y = "Million metric tons") +
      guides(fill = "none") +
      theme_minimal()
    

    Created on 2020-04-12 by the reprex package (v0.3.0)
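
The replication step can also be written as a cross join. The following is only a sketch of that variant, not the code used above: it relies on base R's merge() with by = NULL, which returns the Cartesian product of two data frames. The names panel, fill, pal and steel_rep are made up for this illustration, and year is kept as a plain integer instead of a date to keep the example short.

```r
library(tidyverse)

# Same data as in the question, reshaped to long form
steel_long <- tribble(
  ~country, ~`2016`, ~`2017`, ~`2018`, ~`2019`,
  "China",   828.4,   853.7,   863.9,   988.2,
  "Japan",   104.9,   104.7,   104.2,    99.6,
  "India",    95.0,   101.5,   107.8,   111.5,
  "USA",      80.2,    81.6,    84.2,    88.2,
  "Other",   564.8,   577.7,   587.8,   549.9
) %>%
  pivot_longer(-country, names_to = "year", values_to = "production") %>%
  mutate(year = as.integer(year))

# Countries in order of appearance, so "Other" comes last
countries <- unique(steel_long$country)

# One colour per country (pal is a hypothetical helper name)
pal <- setNames(scales::hue_pal()(length(countries)), countries)

# Cross join: every row of the data is paired with every facet label
steel_rep <- merge(
  data.frame(panel = countries, stringsAsFactors = FALSE),
  steel_long,
  by = NULL
)

# Colour only the rows whose country matches the panel, grey otherwise
steel_rep$fill <- ifelse(
  steel_rep$country == steel_rep$panel,
  unname(pal[steel_rep$country]),
  "grey"
)

# Fix the facet order and the stacking order
steel_rep$panel   <- factor(steel_rep$panel, levels = countries)
steel_rep$country <- factor(steel_rep$country, levels = countries)

p <- ggplot(steel_rep, aes(x = year, y = production, group = country, fill = fill)) +
  geom_col(colour = "white") +
  scale_fill_identity() +  # use the values in `fill` as literal colours
  facet_wrap(~panel) +
  labs(x = NULL, y = "Million metric tons") +
  theme_minimal()
p
```

Because every panel receives the full dataset, the highlighted segment is stacked within the same layer as the grey segments and therefore sits at its true position in the column rather than starting from zero.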