Search code examples
rformattable

R error with formattable "must return compatible vectors across groups"


I am trying to apply currency formatting to TOTAL in a grouped data frame using formattable. However I get the error: "TOTAL" must return compatible vectors across groups.

Here's some reproducible code that produces the error:

library(dplyr)
library(formattable)

n <- 100
data <- data.frame(
   FY = round(runif(n, 2020, 2025), digits = 0),
   LOBJ = round(runif (n, 1, 10), digits = 0),
   VALUE = rnorm(n, 50, 20)
)

data_summary <- data %>%
   group_by(FY, LOBJ) %>%
   summarize(TOTAL = sum(VALUE)) %>%
   mutate(TOTAL = currency(TOTAL))

What am I doing wrong? Thanks in advance!


Solution

  • You could do it by using summarise() making a list and then unseating the result. That will keep the grouping structure and do the right thing with the currency() formatting:

    library(dplyr)
    library(tidyr)
    library(formattable)
    
    n <- 100
    data <- data.frame(
      FY = round(runif(n, 2020, 2025), digits = 0),
      LOBJ = round(runif (n, 1, 10), digits = 0),
      VALUE = rnorm(n, 50, 20)
    )
    
    data_summary <- data %>%
      group_by(FY, LOBJ) %>% 
      summarise(TOTAL = list(currency(sum(VALUE))), .groups="keep") %>%
      unnest(TOTAL)
    data_summary
    #> # A tibble: 45 × 3
    #> # Groups:   FY, LOBJ [45]
    #>       FY  LOBJ TOTAL     
    #>    <dbl> <dbl> <formttbl>
    #>  1  2020     1 $22.79    
    #>  2  2020     3 $28.04    
    #>  3  2020     4 $100.87   
    #>  4  2020     5 $31.04    
    #>  5  2020     6 $98.79    
    #>  6  2020     9 $51.16    
    #>  7  2021     1 $65.79    
    #>  8  2021     2 $56.60    
    #>  9  2021     3 $263.45   
    #> 10  2021     4 $85.72    
    #> # ℹ 35 more rows
    

    It also works as you wrote it if you ungroup the data before calling mutate().

    data_summary <- data %>%
      group_by(FY, LOBJ) %>%
      summarize(TOTAL = sum(VALUE)) %>%
      ungroup %>% 
      mutate(TOTAL = currency(TOTAL))
    #> `summarise()` has grouped output by 'FY'. You can override using the `.groups`
    #> argument.
    data_summary
    #> # A tibble: 45 × 3
    #>       FY  LOBJ TOTAL     
    #>    <dbl> <dbl> <formttbl>
    #>  1  2020     1 $22.79    
    #>  2  2020     3 $28.04    
    #>  3  2020     4 $100.87   
    #>  4  2020     5 $31.04    
    #>  5  2020     6 $98.79    
    #>  6  2020     9 $51.16    
    #>  7  2021     1 $65.79    
    #>  8  2021     2 $56.60    
    #>  9  2021     3 $263.45   
    #> 10  2021     4 $85.72    
    #> # ℹ 35 more rows
    

    Created on 2024-01-26 with reprex v2.0.2