Calculate and append column totals of select columns in a dataframe


I have the following code for calculating certain quantities of interest, specifically the sum of the two right-most columns.

library(dplyr)
library(janitor)

m = c(0, 0.8, 2.3, 4.1, 2.1)
l = c(0.3, 0.8, 0.9, 0.75, 0.25)

mytable = data.frame(l, m)
rownames(mytable) = paste("Group", 1:5)

# Initial population
n0 = c(1,1,1,1,1)

mytable = mytable %>%
  mutate(lm = l * m) %>%
  mutate(n = n0) %>%
  mutate(offspring = lm * n) %>%
  adorn_totals("row") 

This gives the following output:

> mytable
     l   m    lm n offspring
   0.3 0.0 0.000 1     0.000
   0.8 0.8 0.640 1     0.640
   0.9 2.3 2.070 1     2.070
  0.75 4.1 3.075 1     3.075
  0.25 2.1 0.525 1     0.525
 Total 9.3 6.310 5     6.310

I have the following issues:

  • How to isolate the column totals for specific columns? In my case, I would like the column totals for just columns n and offspring. I read the documentation for the adorn_totals() function but I could not figure out how to do this.
  • The row names assigned are missing. How can I make the row names appear and have the word "Total" as the row name for the new row of column totals?
  • The totals row shows no total for the first column (l); the word "Total" appears there instead, which is strange.

Solution

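  • One option is to convert every column except the ones to be totalled to character, so that adorn_totals() does not sum them, and then convert them back afterwards with type.convert(). As for the row names, tibbles do not support row names, so first move them into a regular column with rownames_to_column(). adorn_totals() writes the "Total" label into the first column, which is why the first column (l) had no total in the question; with the row names in front, the label lands in the rn column instead.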

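    library(dplyr)
    library(tibble)
    library(janitor)

    # mytable is the original data frame (columns l and m, with row names),
    # i.e. before the pipeline in the question overwrote it
    out <- mytable %>%
        # move the row names into a regular column so they are kept
        rownames_to_column('rn') %>%
        mutate(lm = l * m, n = n0, offspring = lm * n) %>%
        # make every column except n and offspring character so it is not totalled
        mutate(across(-c(n, offspring), as.character)) %>%
        adorn_totals('row', fill = NA) %>%
        # convert the character columns back to their natural types
        type.convert(as.is = TRUE)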
    

    Output:

    > out
          rn    l   m    lm n offspring
     Group 1 0.30 0.0 0.000 1     0.000
     Group 2 0.80 0.8 0.640 1     0.640
     Group 3 0.90 2.3 2.070 1     2.070
     Group 4 0.75 4.1 3.075 1     3.075
     Group 5 0.25 2.1 0.525 1     0.525
       Total   NA  NA    NA 5     6.310
    > str(out)
    Classes ‘tabyl’ and 'data.frame':   6 obs. of  6 variables:
     $ rn       : chr  "Group 1" "Group 2" "Group 3" "Group 4" ...
     $ l        : num  0.3 0.8 0.9 0.75 0.25 NA
     $ m        : num  0 0.8 2.3 4.1 2.1 NA
     $ lm       : num  0 0.64 2.07 3.075 0.525 ...
     $ n        : int  1 1 1 1 1 5
     $ offspring: num  0 0.64 2.07 3.075 0.525 ...
     - attr(*, "core")='data.frame':    5 obs. of  6 variables:
      ..$ rn       : chr [1:5] "Group 1" "Group 2" "Group 3" "Group 4" ...
      ..$ l        : chr [1:5] "0.3" "0.8" "0.9" "0.75" ...
      ..$ m        : chr [1:5] "0" "0.8" "2.3" "4.1" ...
      ..$ lm       : chr [1:5] "0" "0.64" "2.07" "3.075" ...
      ..$ n        : num [1:5] 1 1 1 1 1
      ..$ offspring: num [1:5] 0 0.64 2.07 3.075 0.525
     - attr(*, "tabyl_type")= chr "two_way"
     - attr(*, "totals")= chr "row"
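
  • As an alternative sketch that avoids the character round trip, the totals row can be built with dplyr alone: summarise only the columns of interest, label the result, and append it with bind_rows(), which fills the columns missing from the totals row with NA. This is not the approach from the answer above, and the names body, totals, out2 and out_rn are illustrative only. The data are rebuilt from the question so that the block runs on its own.

```r
library(dplyr)
library(tibble)

# Data from the question
m  <- c(0, 0.8, 2.3, 4.1, 2.1)
l  <- c(0.3, 0.8, 0.9, 0.75, 0.25)
n0 <- c(1, 1, 1, 1, 1)

mytable <- data.frame(l, m)
rownames(mytable) <- paste("Group", 1:5)

# Main table, with the row names moved into a regular column
body <- mytable %>%
  rownames_to_column("rn") %>%
  mutate(lm = l * m, n = n0, offspring = lm * n)

# One-row table holding totals for the selected columns only
totals <- body %>%
  summarise(across(c(n, offspring), sum)) %>%
  mutate(rn = "Total")

# bind_rows() matches columns by name; l, m and lm become NA in the totals row
out2 <- bind_rows(body, totals)

# Optional: turn the rn column back into real row names
out_rn <- column_to_rownames(out2, "rn")
```

    The final column_to_rownames() step is only needed when a base data frame with actual row names is wanted; otherwise out2 already carries the group labels and "Total" in its rn column.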