Strategies in mapping text size to data in ggplot


I would like advice on how to map text size to data in ggplot(). In the following silly example, I have data describing some English letters and the average "liking" score each letter received. That is, imagine that we surveyed people and asked them, "to what extent do you like the letter [ ], on a scale of 1-7, where 1 means strongly dislike and 7 means like very much".

For statistical reasons that are beyond the scope of this question, I do not want to use a bar plot, as I seek to minimize the desire to compare between the mean values. Hence, I chose a different visualization, as seen below.

My issue is: I want to give the viewer a feeling that accounts for the difference in values. So I decided to map the size of geom_text() to the actual value presented. However, this gets a little tricky when I try to make it look nice.

library(ggplot2)
library(ggforce)

my_df <-
  data.frame(
  letter = letters[1:16],
  mean_liking = c(
    3.663781,
    3.814590,
    3.806543,
    3.788288,
    3.756278,
    4.491339,
    3.549708,
    3.799703,
    3.651306,
    4.522255,
    4.075301,
    5.619614,
    3.917391,
    2.579243,
    3.692090,
    4.439822
  )
)

## scenario 1 -- without mapping size 
ggplot(data = my_df) +
  geom_circle(aes(x0 = 0, y0 = 0, r = 0.5, fill = letter), show.legend = FALSE) +
  geom_text(aes(label = round(mean_liking, 2), x = 0, y = 0)) +
  coord_fixed() +
  facet_wrap(~letter) +
  theme_void()

## scenario 2 --  mapping size "plainly" (so to speak)
ggplot(data = my_df) +
  geom_circle(aes(x0 = 0, y0 = 0, r = 0.5, fill = letter), show.legend = FALSE) +
  geom_text(aes(label = round(mean_liking, 2), x = 0, y = 0, 
                size = mean_liking)) + # <-- mapped here
  coord_fixed() +
  facet_wrap(~letter) +
  theme_void()

  
## scenario 3 --  mapping size multiplied by 10
ggplot(data = my_df) +
  geom_circle(aes(x0 = 0, y0 = 0, r = 0.5, fill = letter), show.legend = FALSE) +
  geom_text(aes(label = round(mean_liking, 2), x = 0, y = 0, 
                size = mean_liking*10)) + # <-- mapped here; getting strange
  coord_fixed() +
  facet_wrap(~letter) +
  theme_void()

Created on 2021-08-17 by the reprex package (v2.0.0)


As can be seen above, scenarios 2 and 3 both resulted in unreadable text size for letter n. So I have a couple of questions:

  1. Why does text size remain the same, despite multiplying by 10?
  2. How could I have the text size vary according to the mean_liking value?
  3. Is there any useful strategy that takes into account the fact that those means were generated from a finite scale that ranges 1-7? I guess this implies some subjective judgment on how one would choose to visualize it, but I'm very interested to get more perspectives on this.

Thank you!


Solution

  • Regarding your questions:

    1. ggplot doesn't use the data values directly as sizes. A continuous size scale rescales them to an output range, by default from 1 to 6, so the smallest value always gets size 1 and the largest gets size 6. Multiplying the data by 10 (or by any other positive constant) therefore leaves the result unchanged.

    2. To actually change the sizes, either change the output range with scale_size(range = c(min_size, max_size)), or don't map mean_liking to size at all and instead set size to the values of mean_liking outside of aes(). In the second case the values are used as they are, in ggplot's size units (mm for text).

    # Mapping mean_liking to size and change size-ranges
    ggplot(data = my_df) +
      geom_circle(aes(x0 = 0, y0 = 0, r = 0.5, fill = letter), show.legend = FALSE) +
      geom_text(aes(label = round(mean_liking, 2), x = 0, y = 0, 
                    size = mean_liking)) + # <-- mapped here
      scale_size(range = c(4, 8)) + 
      coord_fixed() +
      facet_wrap(~letter) +
      theme_void()
    
    # setting size to the values of mean_liking directly
    ggplot(data = my_df) +
      geom_circle(aes(x0 = 0, y0 = 0, r = 0.5, fill = letter), show.legend = FALSE) +
      geom_text(aes(label = round(mean_liking, 2), x = 0, y = 0), 
                size = my_df$mean_liking) + 
      coord_fixed() +
      facet_wrap(~letter) +
      theme_void()
    
    3. Unfortunately I don't have any good strategies here, sorry.
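
One possible approach to the third question, offered only as a sketch: fix the limits of the size scale to the survey's 1-7 range, so that text size reflects where a mean sits on the full scale rather than where it sits among the observed means. The code below also checks the first point, that multiplying by 10 changes nothing under the default scale. The helper name text_sizes, the rounded data values, and the output range c(2, 10) are assumptions made for illustration. It uses scale_radius() instead of scale_size(), because scale_radius() maps values to size linearly, whereas scale_size() scales by area; whether linear is the better choice for text is a judgment call.

```r
library(ggplot2)

# Same data as in the question, rounded to two decimals to keep the example short
my_df <- data.frame(
  letter = letters[1:16],
  mean_liking = c(3.66, 3.81, 3.81, 3.79, 3.76, 4.49, 3.55, 3.80,
                  3.65, 4.52, 4.08, 5.62, 3.92, 2.58, 3.69, 4.44)
)

# Hypothetical helper: build the plot with a given size scale and
# return the text sizes that ggplot actually computed for the layer
text_sizes <- function(size_scale, multiplier = 1) {
  p <- ggplot(my_df) +
    geom_text(aes(x = 0, y = 0, label = round(mean_liking, 2),
                  size = mean_liking * multiplier)) +
    size_scale +
    facet_wrap(~letter)
  layer_data(p, 1)$size
}

# Default scale: sizes are rescaled to the range 1-6 either way,
# so multiplying the data by 10 gives identical sizes
default_sizes <- text_sizes(scale_size())
scaled_sizes  <- text_sizes(scale_size(), multiplier = 10)
print(isTRUE(all.equal(default_sizes, scaled_sizes)))

# Anchored scale: 1 maps to size 2 and 7 maps to size 10 (arbitrary choices),
# regardless of which means happen to be in the data
anchored_sizes <- text_sizes(scale_radius(limits = c(1, 7), range = c(2, 10)))
print(round(range(anchored_sizes), 2))
```

To apply this to the plot from the question, add scale_radius(limits = c(1, 7), range = c(2, 10)) in place of scale_size(range = c(4, 8)) and adjust the range until the smallest label is still readable.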