
Legend colors and plot colors differ


I'm using ggplot and trying to create a plot that has letters as the labels, but full descriptions in the legend. I'm very close to getting it with this code, but as of now the legend displays the correct color information while the plot shows seemingly random colors.

Dput:

subset <- structure(list(Category = c("BNPL", "Digital profile", "Voice", 
"Price matching", "Marketing opt-In", "Promo codes", "Two-factor authentication", 
"Using mobile device to locate a product in a physical store", 
"Profile (shopping journey among different channels)", "Inventory"
), Consumers = c(0.401189529, 0.512550424, 0.428663368, 0.468411444, 
0.496927651, 0.560380251, 0.516336569, 0.520916203, 0.433138295, 
0.551730895), Merchants = c(0.548387097, 0.654121864, 0.543010753, 
0.562724014, 0.571684588, 0.632616487, 0.586021505, 0.575268817, 
0.464157706, 0.575268817), Dif = c(-0.147197568, -0.141571439, 
-0.114347385, -0.09431257, -0.074756937, -0.072236237, -0.069684937, 
-0.054352614, -0.031019411, -0.023537922), Country = structure(c(7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), .Label = c("All", "AUS", 
"BRA", "MEX", "UAE", "UK", "US"), class = "factor"), Year = c(2021L, 
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L
), SumInterest = c(94.9576626, 116.6672288, 97.1674121, 103.1135458, 
106.8612239, 119.2996738, 110.2358074, 109.618502, 89.7296001, 
112.6999712), absdiff = c(14.7197568, 14.1571439, 11.4347385, 
9.431257, 7.4756937, 7.2236237, 6.9684937, 5.4352614, 3.1019411, 
2.3537922), percdiff = c(-0.147197568, -0.141571439, -0.114347385, 
-0.09431257, -0.074756937, -0.072236237, -0.069684937, -0.054352614, 
-0.031019411, -0.023537922), Letter = c("A", "B", "C", "D", "E", 
"F", "G", "H", "I", "J"), color = c("P", "L", "P", "P", "P", 
"L", "L", "L", "P", "L"), hex = c("#ff0054", "#9e0059", "#ff0054", 
"#ff0054", "#ff0054", "#9e0059", "#9e0059", "#9e0059", "#ff0054", 
"#9e0059")), row.names = c(NA, 10L), class = "data.frame")

library(ggplot2)
library(ggrepel)
library(magrittr)  # provides the %>% pipe

plot1 <- subset %>%
  ggplot(aes(x = Consumers,
             y = percdiff,
             fill = Category,
             label = Letter)) +
  geom_hline(yintercept=0 ,color = "lightgrey", size=1.5)+
  geom_vline(xintercept=.5, color = "lightgrey", size=1.5)+
  geom_point() +
  geom_label_repel(color="White", segment.color="black") +
  scale_fill_manual(labels = subset$Category, values = subset$hex) +
  guides(
    fill = guide_legend(
      title = '',
      ncol = 1,
      override.aes = list(label = subset$Letter, color = 'white', family = "Lato")
    )
  ) +
  labs(title = "Consumer interest Vs. Awareness Gap",
       subtitle = "In the US 2021",
       x = "Consumer Interest",
       y = "Awareness Gap") +
  theme(legend.position = 'right') +
  scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
  scale_x_continuous(labels = scales::percent_format(accuracy = 0.1))



Solution

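  • The plot colors look random because an unnamed vector passed to values is matched by position to the alphabetically sorted categories, not to the row order of your data. The labels and override.aes vectors are also applied by position, so the legend looks consistent while the points get the wrong fill. To achieve your desired result you have to set the right order: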

    1. Just in case: order your data by the Letter column.
    2. Order the levels of your Category column by letter; I use forcats::fct_inorder for this.
    3. Create a named vector of colors that assigns a hex value to each of your categories.
    4. Use that color vector in scale_fill_manual and drop the labels argument, which is no longer necessary.

    Note: I dropped the Lato font for the reprex.

    library(ggrepel)
    #> Loading required package: ggplot2
    library(ggplot2)
    
    # Make named vector of colors
    colors <- setNames(subset$hex, subset$Category)
    # Order by letters
    subset <- subset[order(subset$Letter), ]
    # Set order of factor
    subset$Category <- forcats::fct_inorder(subset$Category)
    
    ggplot(subset, aes(
        x = Consumers,
        y = percdiff,
        fill = Category,
        label = Letter
      )) +
      geom_hline(yintercept = 0, color = "lightgrey", size = 1.5) +
      geom_vline(xintercept = .5, color = "lightgrey", size = 1.5) +
      geom_point() +
      geom_label_repel(color = "White", segment.color = "black") +
      scale_fill_manual(values = colors) +
      guides(
        fill = guide_legend(
          title = "",
          ncol = 1,
          override.aes = list(label = subset$Letter, color = "white")
        )
      ) +
      labs(
        title = "Consumer interest Vs. Awareness Gap",
        subtitle = "In the US 2021",
        x = "Consumer Interest",
        y = "Awareness Gap"
      ) +
      theme(legend.position = "right") +
      scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
      scale_x_continuous(labels = scales::percent_format(accuracy = 0.1))
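
To see why the named vector matters, here is a minimal base-R sketch that does not need ggplot2. It assumes, as documented for scale_fill_manual, that unnamed values are matched by position to the scale's limits (alphabetical for a character column), while named values are matched by name. The variable names cats, hex, lvls, by_position and by_name are illustrative only, and just four rows of the question's data are used.

```r
# Four rows from the question's data, kept in their original row order
cats <- c("BNPL", "Digital profile", "Voice", "Promo codes")
hex  <- c("#ff0054", "#9e0059", "#ff0054", "#9e0059")

# A character column mapped to fill ends up with alphabetically sorted levels
lvls <- sort(unique(cats))

# Unnamed values: paired with the sorted levels purely by position
by_position <- setNames(hex, lvls)

# Named values: looked up by category name, so row order no longer matters
by_name <- setNames(hex, cats)[lvls]

# Side-by-side view of the color each category receives under both schemes
print(data.frame(
  Category    = lvls,
  by_position = unname(by_position),
  by_name     = unname(by_name)
))
```

In this sketch "Promo codes" and "Voice" swap colors under positional matching, which is the kind of mismatch the question describes; with the named vector each category keeps the hex value from its own row.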