geom_text: Conditional horizontal alignment of labels (hjust using ifelse)


I've been racking my brain over an issue with the outward alignment of geom_text labels on a geom_col chart, and I've put together an MWE that reproduces it. The chart has columns in both positive and negative territory, and some negative values are much larger in magnitude than the positive ones, which pushes the "0" mark of the axis off to one side.

Here is the general setting (note that I'm on Windows 11 and use R version 4.2.2) and the data:

library(ggplot2)
library(dplyr)
library(tidyr)
ggplot2::theme_set(theme_minimal())

df <- tibble(period = c("one_y", "three_y", "five_y", "seven_y", "ten_y", "whole_period"),
             nominal_gross = c(-21.13, NA, NA, 3.04, 4.1, 4),
             nominal_net = c(-21.64, -3.48, 0.6, 2.58, 3.69, 3.7),
             real_net = c(-25.65, -8.97, -3.75, -0.89, 1.04, 1.25)) %>%
  pivot_longer(!period, names_to = "return_type", values_to = "return_values")

I tried hjust = "outward", but as I understand it, that setting aligns labels relative to the midpoint of the plot area (around -10 in my case) rather than the "0" on the axis. Here is the code, with the resulting graph below:

ggplot(data = df, 
       mapping = aes(
         x = factor(period, 
                    levels = rev(c("one_y", "three_y", "five_y", "seven_y", "ten_y", "whole_period")),
                    labels = rev(c("1 year", "3 years", "5 years", "7 years", "10 years", "Whole period"))),
         y = return_values, 
         label = scales::label_percent(scale = 1)(return_values),
         fill = return_type)) +
  geom_col(position = position_dodge2(reverse = TRUE), na.rm = TRUE) +
  geom_text(position = position_dodge2(reverse = TRUE, width = 0.9),
            na.rm = TRUE,
            hjust = "outward",
            vjust = 0.4,
            size = 3,
            check_overlap = TRUE) +
  scale_fill_manual(values = c("#006FB4", "darkgrey", "#01416B"), 
                    labels = c("Nominal gross", "Nominal net", "Real net"),
                    name = NULL) +   
  scale_y_continuous(n.breaks = 10,
                     expand = expansion(add = ifelse(max(abs(df$return_values), na.rm = TRUE) > 10, 3.5, 2)),
                     labels = scales::label_percent(scale = 1)) +
  labs(x = NULL, y = "Annualised returns") +
  coord_flip()

Plot with "outward"
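The "around -10" midpoint can be checked with a little arithmetic. The following is only a rough sketch in base R: it assumes the axis limits are simply the data range plus the padding set through expansion(add = ...), and it takes the question's reading of "outward" (alignment relative to the centre of the plot area) at face value. The names pad, axis_limits, midpoint and inside_bar are made up for this illustration.

```r
# Values from the question's data (NA where no figure is available).
return_values <- c(-21.13, NA, NA, 3.04, 4.1, 4,            # nominal_gross
                   -21.64, -3.48, 0.6, 2.58, 3.69, 3.7,     # nominal_net
                   -25.65, -8.97, -3.75, -0.89, 1.04, 1.25) # real_net

# Same padding rule as in the question's scale_y_continuous() call.
pad <- ifelse(max(abs(return_values), na.rm = TRUE) > 10, 3.5, 2)

# Assumption: axis limits = data range widened by the padding on each side.
axis_limits <- range(return_values, na.rm = TRUE) + c(-pad, pad)
midpoint <- mean(axis_limits)

# Negative values that sit between the midpoint and zero: with "outward"
# these labels are pushed away from the centre, i.e. towards zero and
# therefore into their own bars.
inside_bar <- return_values[!is.na(return_values) &
                              return_values < 0 &
                              return_values > midpoint]

print(axis_limits)
print(midpoint)
print(inside_bar)
```

Under these assumptions the midpoint comes out at about -10.8, and the four negative values between it and zero are exactly the ones whose labels end up inside the bars.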

The result is not too bad, but the labels for negative values between 0 and -10 end up inside the bars. I then tried using ifelse to make hjust conditional on return_values. Again, here is the code and the resulting graph:

ggplot(data = df, 
       mapping = aes(
         x = factor(period, 
                    levels = rev(c("one_y", "three_y", "five_y", "seven_y", "ten_y", "whole_period")),
                    labels = rev(c("1 year", "3 years", "5 years", "7 years", "10 years", "Whole period"))),
         y = return_values, 
         label = scales::label_percent(scale = 1)(return_values),
         fill = return_type)) +
  geom_col(position = position_dodge2(reverse = TRUE)) +
  geom_text(position = position_dodge2(reverse = TRUE, width = 0.9),
            hjust = ifelse(test = df$return_values < 0, yes = 1.25, no = -0.25),
            vjust = 0.4,
            size = 3,
            check_overlap = TRUE) +
  scale_fill_manual(values = c("#006FB4", "darkgrey", "#01416B"), 
                    labels = c("Nominal gross", "Nominal net", "Real net"),
                    name = NULL) +   
  scale_y_continuous(n.breaks = 10,
                     expand = expansion(add = ifelse(max(abs(df$return_values), na.rm = TRUE) > 10, 3.5, 2)),
                     labels = scales::label_percent(scale = 1)) +
  labs(x = NULL, y = "Annualised returns") +
  coord_flip()

Plot with "ifelse"

This is actually worse: most of my labels are now inside the bars, although some are correctly placed. I concluded that there is an issue with my conditional statement. I tried removing the NAs and replacing them with zeros, but I could not find the source of the problem. If anyone could help me find a solution, I'd be most grateful.

Separately, if there is a way to "crop" the plot by breaking the axis and dropping the area between -10 and -20 (the area between the two black lines in the image below), I'd like that as well. I have tried setting limits within scale_y_continuous and using the oob argument, but with every option I tried, the longest bars simply stop at the limit, so I lose the difference in length between the three.

Plot with ifelse cropping

EDIT

I've actually just found the answer: the conditional statement must be mapped to hjust inside aes() in geom_text, so that it is treated as an aesthetic:

geom_text(aes(hjust = ifelse(return_values < 0, 1.25, -0.25)),
          position = position_dodge2(reverse = TRUE, width = 0.9),
          vjust = 0.4,
          size = 3,
          check_overlap = TRUE)

Then it works fine.


Solution

  • If you want the labels to sit to the left of the value for negatives and to the right for positives, you can move the hjust into aes(). This works more consistently than referencing df$return_values, because the aesthetic is evaluated on the same rows, in the same order, as the bars, which might not correspond to the row order in the original data. Setting clip = "off" in coord_flip() also keeps labels that extend past the panel edge from being cut off.

      ggplot(data = df, 
             mapping = aes(
               x = factor(period, 
                          levels = rev(c("one_y", "three_y", "five_y", "seven_y", "ten_y", "whole_period")),
                          labels = rev(c("1 year", "3 years", "5 years", "7 years", "10 years", "Whole period"))),
               y = return_values, 
               label = scales::label_percent(scale = 1)(return_values),
               fill = return_type)) +
      geom_col(position = position_dodge2(reverse = TRUE), na.rm = TRUE) +
      geom_text(position = position_dodge2(reverse = TRUE, width = 0.9),
                na.rm = TRUE,
                aes(hjust = ifelse(return_values < 0, 1.1, -0.1)),  ##### SEE HERE #####
                vjust = 0.4,
                size = 3,
                check_overlap = TRUE) +
      scale_fill_manual(values = c("#006FB4", "darkgrey", "#01416B"), 
                        labels = c("Nominal gross", "Nominal net", "Real net"),
                        name = NULL) +   
      scale_y_continuous(n.breaks = 10,
                         expand = expansion(add = ifelse(max(abs(df$return_values), na.rm = TRUE) > 10, 3.5, 2)),
                         labels = scales::label_percent(scale = 1)) +
      labs(x = NULL, y = "Annualised returns") +
      coord_flip(clip = "off")                                 ##### SEE HERE #####
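To see why a vector built from df$return_values can end up attached to the wrong labels, here is a rough base-R illustration. It does not reproduce ggplot2's internals: the sort by the reversed period factor is only an assumed stand-in for whatever reordering the layer applies before drawing, and the names df_long, plot_order, plotted, hjust_outside and hjust_expected are hypothetical.

```r
# Long-format data in the row order that pivot_longer() gives in the question.
periods <- c("one_y", "three_y", "five_y", "seven_y", "ten_y", "whole_period")
df_long <- data.frame(
  period = rep(periods, each = 3),
  return_type = rep(c("nominal_gross", "nominal_net", "real_net"), times = 6),
  return_values = c(-21.13, -21.64, -25.65,
                    NA, -3.48, -8.97,
                    NA, 0.6, -3.75,
                    3.04, 2.58, -0.89,
                    4.1, 3.69, 1.04,
                    4, 3.7, 1.25)
)

# Positional vector built once from the original row order,
# as in the failing attempt with df$return_values.
hjust_outside <- ifelse(df_long$return_values < 0, 1.25, -0.25)

# Assumed stand-in for the layer's reordering: sort by the reversed factor.
plot_order <- order(factor(df_long$period, levels = rev(periods)))
plotted <- df_long[plot_order, ]

# What each reordered row actually needs.
hjust_expected <- ifelse(plotted$return_values < 0, 1.25, -0.25)

# The positional vector is still in the old order, so some rows disagree
# (comparisons involving NA are not counted as mismatches).
mismatch <- (hjust_outside != hjust_expected) %in% TRUE

# Computing hjust from the reordered rows themselves, which is what a
# mapping inside aes() amounts to, always agrees with the rows.
plotted$hjust_inside <- ifelse(plotted$return_values < 0, 1.25, -0.25)

print(data.frame(plotted[, c("period", "return_values")],
                 hjust_outside, hjust_expected))
print(sum(mismatch))
```

Any reordering of the rows is enough to break a positional vector, whereas a value derived from the rows themselves travels with them.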
      
    
