Need to force specific colors to specific values of a field in a ggplot geom_col chart


I'm using Border Crossing data from the Bureau of Transportation Statistics and trying to get proficient with R by reproducing a dashboard I built in Tableau (published here). I want to use the same colors in R, but I'm struggling to force specific colors onto specific values of the field "Measure" (Personal Vehicles, Pedestrians, Trucks, etc.).

It should look like this: [image: Tableau version with proper colors]

But this is what I'm getting with ggplot: [image: my ggplot results]

Here's my code

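library(dplyr)    # %>%, mutate, select, filter, group_by, summarise, arrange
library(tidyr)    # separate
library(ggplot2)  # plotting

# reader() is not a defined function; read.csv() is assumed here
data_00_base <- read.csv("Border_Crossing_Entry_Data.csv")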
data_00_base <- data.frame(data_00_base)
data_00_base$Value <- as.numeric(gsub("[^0-9.-]", "", data_00_base$Value))


# Make this happen
# Line Chart of Border Crossings per Month, Mexico vs Canada
# Bar Charts stratified by Modes of Crossing the Border, MEX vs CAN
# Nuances
#   Any drastic changes in the 20+ years of data collection.  COVID hit in March 2020.  How long was the recovery.  
#       Did 9/11 impact border crossings
#   MEX vs CAN, Volume, cyclical patterns, almost no Pedestrians from Canada, and MEX about 4:1 on most Modes.  CAN wins for Trains.
# 
#   Personal Vehicles
#      Strong cyclical for CAN
#      MEX steady # of cars but decreasing # of Passengers. Show graph that shows ratio of Passengers/Vehicle vs time
#      MEX peaks in December (holidays), trough in Feb

# rm(data_10a)

# Add some features to the BASE data frame.  This is the BASE that all sub-queries (data frames) will pull from.
data_00_base_mod <- data_00_base %>%
  mutate(Border_Code = ifelse(grepl("Canada", Border, fixed = TRUE), "CAN", "MEX")) %>%
  separate(Date, c("Month", "Year"), remove = FALSE)  %>%
  mutate(Measure = factor(Measure, ordered = TRUE, levels = c("Bus Passengers","Buses","Pedestrians","Personal Vehicle Passengers","Personal Vehicles","Rail Containers Empty","Rail Containers Loaded","Train Passengers","Trains","Truck Containers Empty","Truck Containers Loaded","Trucks")))

  # Augment with a dimension table to help control colors and sorting in the presentation tier
  # This is all stuff I decided on when building the Tableau dashboard.  I want to follow that lead with R to make it easier to compare/contrast output side-by-side
  dim_measure <- data.frame(
    Measure=c("Bus Passengers","Buses","Pedestrians","Personal Vehicle Passengers","Personal Vehicles","Rail Containers Empty","Rail Containers Loaded","Train Passengers","Trains","Truck Containers Empty","Truck Containers Loaded","Trucks")
    ,mode_main_sort=c(8,7,3,2,1,11,10,12,9,6,5,4)
    ,mode_color_code=c("#C799BC","#8074A8","#4E79A7","#F59C3C","#C14F22","#CDCECD","#5B6570","#89C8CC","#848E93","#F4D166","#B2C25B","#34844A")
  )
  
  # Left Join the Base table with supplemental information about the Measures to help with the presentation layer
  data_00_base_mod <- merge(x = data_00_base_mod, y = dim_measure, by = "Measure", all.x = TRUE)

# Round #1
# High-level summary of CAN vs MEX vs time vs Mode
data_10a_core <- data_00_base_mod %>%
  select(Year, Border_Code, Measure, Value, mode_main_sort, mode_color_code) %>%
  filter(Measure %in% c("Personal Vehicles", "Pedestrians", "Trucks", "Buses", "Trains")) %>%
  group_by(Year, Border_Code, Measure, mode_main_sort, mode_color_code) %>%
  summarise(annual_crossings = sum(Value), record_cnt = n(), .groups = 'keep') %>%
  arrange(Border_Code, Year, mode_main_sort, Measure)

#############################################################################
# NOT Working as Expected
# I've tried lots of different options, but can't figure out how to assign
# the values in the dim_measure data frame to the Measures field.
#
# Many thanks for anybody that can get me over this hurdle.
#############################################################################
ggplot(data_10a_core) + 
    geom_col(aes(x=Year, y=annual_crossings, color = mode_color_code, fill = mode_color_code)) +
    facet_grid(cols = vars(Border_Code)) + 
    theme(axis.text.x = element_text(angle=90)) +
    scale_y_continuous("Annual Inbound Border Crossings",
       # breaks = scales::breaks_extended(8),
       labels = scales::label_number()  
    )

Solution

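  • Values mapped inside aes() are treated as data, so ggplot reads the hex codes in mode_color_code as category labels and assigns its default palette to them. If you want the geoms to use the fill and colour values you have mapped with aes() literally, add identity scales to the plot:

    scale_fill_identity() +
    scale_color_identity()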
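
As a minimal sketch of the identity-scale approach, the snippet below uses a small made-up data frame (`toy`, a hypothetical stand-in for `data_10a_core`; the counts are invented for illustration) so that it runs without the CSV file. It also shows one possible alternative that is not part of the answer above: mapping fill to `Measure` and passing a named vector of hex codes to `scale_fill_manual()`. The object names `p_identity`, `p_manual`, `pal` and `measure_colors` are assumptions introduced here.

```r
library(ggplot2)

# Toy stand-in for data_10a_core; the crossing counts are invented
toy <- data.frame(
  Year = rep(c("2019", "2020"), each = 3),
  Border_Code = "MEX",
  Measure = rep(c("Personal Vehicles", "Pedestrians", "Trucks"), times = 2),
  annual_crossings = c(50, 30, 10, 30, 20, 9),
  mode_color_code = rep(c("#C14F22", "#4E79A7", "#34844A"), times = 2),
  stringsAsFactors = FALSE
)

# Option 1: identity scales use the hex codes stored in the column as-is
p_identity <- ggplot(toy) +
  geom_col(aes(x = Year, y = annual_crossings,
               colour = mode_color_code, fill = mode_color_code)) +
  facet_grid(cols = vars(Border_Code)) +
  scale_fill_identity() +
  scale_colour_identity()

# Option 2 (alternative, not from the answer): map fill to Measure and
# supply a named vector so each Measure value gets its own hex code
pal <- unique(toy[, c("Measure", "mode_color_code")])
measure_colors <- setNames(pal$mode_color_code, pal$Measure)

p_manual <- ggplot(toy) +
  geom_col(aes(x = Year, y = annual_crossings, fill = Measure)) +
  facet_grid(cols = vars(Border_Code)) +
  scale_fill_manual(values = measure_colors)
```

Identity scales draw no legend by default, so the second variant may be the more convenient one when a legend labelled by Measure is wanted.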