
geom_waffle not showing one cell


I've been working with the waffle package and I can't figure out how to fix this waffle plot so that it shows 100 squares/cells.

[Image: waffle plot]

This is the data and my code:

# devtools::install_github("hrbrmstr/waffle")
library(ggplot2)
library(waffle)
library(hrbrthemes) # provides theme_ipsum_rc()

data <- structure(list(greenfield = c(0, 1), n = c(162L, 399L), total_investments = c(561, 
561), percentage = c(28.8770053475936, 71.1229946524064), name_greenfield = c("M&A", 
"Greenfield")), row.names = c(NA, -2L), groups = structure(list(
    greenfield = c(0, 1), .rows = structure(list(1L, 2L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

# Waffle plot

ggplot(data,
       aes(fill = name_greenfield,
           values = percentage))  +
  coord_equal() +
  geom_waffle(size = 0.31,  
              flip = T, 
              make_proportional = T, 
              height = 0.9, width = 0.9,
              color = "white") +
  coord_equal() + 
  theme_ipsum_rc(grid="",
                 plot_title_face = "plain",
                 plot_title_size = 14,
                 plot_title_margin = 0) +
  theme_enhance_waffle()  +
  labs(title = "Types of investments",
       fill = "",
       caption = "Source: Author's own elaboration") + 
  scale_fill_grey(start = 0.4, end = 0.7)

I haven't found the cause and would really appreciate it if someone could help me!


Solution

  • This looks like an odd bug in the package rather than a mistake in your code. Here is what I suspect is happening. In the package source on GitHub, there is a file stat-waffle.R. At line 93 you will find:

    if (params[["make_proportional"]]) {
      .x[["values"]] <- .x[["values"]] / sum(.x[["values"]])
      .x[["values"]] <- round_preserve_sum(.x[["values"]], digits = 2)
      .x[["values"]] <- as.integer(.x[["values"]] * 100)
    }
    

    For your values, this block gives a surprising result. Your percentages round to 29 and 71, so take those as the input:

    .x <- list(values = c(29, 71))
    
    .x[["values"]] <- .x[["values"]] / sum(.x[["values"]])
    # [1] 0.29 0.71
    .x[["values"]] <- waffle:::round_preserve_sum(.x[["values"]], digits = 2)
    # [1] 0.29 0.71
    .x[["values"]] <- as.integer(.x[["values"]] * 100)
    # [1] 28 71
    

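    That is why one square is missing: only 28 + 71 = 99 cells are drawn. The cause is floating-point arithmetic. 0.29 * 100 is stored as a value slightly below 29, and as.integer() truncates instead of rounding.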

    # 0.29 and 0.71 are the proportions computed above
    format(c(0.29, 0.71) * 100, digits = 22)
    # [1] "28.999999999999996" "71.000000000000000"
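
    To see the truncation in isolation, here is a minimal base-R sketch that does not need waffle at all. The variable name x is only for illustration; the sketch compares truncating with as.integer() against rounding first.

    ```r
    x <- 0.29 * 100          # stored as a double slightly below 29

    x == 29                  # FALSE: not exactly 29
    isTRUE(all.equal(x, 29)) # TRUE: equal within numerical tolerance
    as.integer(x)            # 28: as.integer() truncates toward zero
    as.integer(round(x))     # 29: rounding first recovers the intended value
    ```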
    

    If the last line of that block in the package source rounded before converting to integer, the result would be as expected:

    .x[["values"]] <- as.integer(round(.x[["values"]] * 100))
    # [1] 29 71
    

    What does this mean for you? Round your values to integers that sum to 100 ahead of time and set make_proportional = FALSE, so that this block of code never runs.

    data$percentage <- as.integer(round(round(data$percentage / sum(data$percentage), 2) * 100))
    data$percentage
    # [1] 29 71
    
    ggplot(data,
           aes(fill = name_greenfield,
               values = percentage))  +
      geom_waffle(size = 0.31,
                  flip = TRUE,
                  make_proportional = FALSE, # values are already integers summing to 100
                  height = 0.9, width = 0.9,
                  color = "white") +
      coord_equal() + # one call is enough; a second one only triggers a message
      theme_enhance_waffle()  +
      labs(title = "Types of investments",
           fill = "",
           caption = "Source: Author's own elaboration") +
      scale_fill_grey(start = 0.4, end = 0.7)
    

    And this should give you what you were expecting.

    [Image: waffle plot]
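
    One caveat about the rounding step above: it rounds each value independently, which gives 29 and 71 here but is not guaranteed to sum to 100 for other inputs (three equal shares would give 33, 33, 33). A possible sketch of a more general approach is largest-remainder rounding. The helper name round_to_total and its total argument are hypothetical and not part of waffle.

    ```r
    # Hypothetical helper (not from the waffle package): scale x so it sums to
    # `total`, take the floor of each share, then hand the leftover units to
    # the entries with the largest fractional parts.
    round_to_total <- function(x, total = 100L) {
      scaled <- x / sum(x) * total
      base <- floor(scaled)
      leftover <- as.integer(round(total - sum(base)))
      if (leftover > 0) {
        idx <- order(scaled - base, decreasing = TRUE)[seq_len(leftover)]
        base[idx] <- base[idx] + 1
      }
      as.integer(base)
    }

    round_to_total(c(28.8770053475936, 71.1229946524064))
    # [1] 29 71

    sum(round_to_total(c(1, 1, 1)))
    # [1] 100
    ```

    The result could then be assigned to data$percentage and plotted with make_proportional = FALSE as shown above.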