My question can be considering an extension of the following discussion: R expss package: format numbers by statistic / apply different format to alternate rows
I would like to understand the grammar of conditions to be able to write my own custom formats. Consider the 'insert' dataframe from datasets. Then we create the following table thanks to expss:
infert %>%
tab_cells(parity) %>%
### TOTAL
tab_cols(total()) %>%
tab_stat_cases(label="N", total_row_position="none") %>%
### OTHER VARIABLES
tab_cols(education) %>%
tab_stat_cases(label="N", total_row_position="none") %>%
tab_stat_cpct(label="%Col.", total_row_position="none") %>%
tab_pivot(stat_position="inside_columns") %>%
format_vert()
The last line operates basic formatting, as discussed in the URL above. In details:
format_vert = function(tbl, pct_digits=2, n_digits=0){
#Finding columns to format
pct_cols = grepl("\\|%Col.$", names(tbl), perl = TRUE)
n_cols = grepl("\\|N$", names(tbl), perl = TRUE)
#Format
recode(tbl[,-1]) = other ~ function(x) ifelse(is.numeric(x) & is.na(x), 0, x)
tbl[,pct_cols] = format(tbl[,pct_cols], digits=pct_digits, nsmall=pct_digits)
tbl[,n_cols] = format(tbl[,n_cols], digits=n_digits, nsmall=n_digits)
recode(tbl[,pct_cols]) = other ~ function(x) paste0(x, "%")
tbl
}
I understand how to format whole tables or columns (experts would have noticed the differences vs. the example in the URL), but what if I only wish to format specific cells? For instance, how to set digits=0
when value = 100,00% (to only show 100%) ?
I don't know if I should go for recode
, format
, when and where to reference tbl[,pct_cols]
...
Thank you!
The simplest way is to insert additional recodings into recode
in the function format_vert
. We can't use recoding in the form of '100.00' ~ '100' because columns are already aligned with spaces. So we use regular expressions. perl
means perl-style regex comparison and \\b
means word boundary. All values which will match with such expression will be recoded.
data(infert)
format_vert = function(tbl, pct_digits=2, n_digits=0){
#Finding columns to format
pct_cols = grepl("\\|%Col.$", names(tbl), perl = TRUE)
n_cols = grepl("\\|N$", names(tbl), perl = TRUE)
#Format
recode(tbl[,-1]) = other ~ function(x) ifelse(is.numeric(x) & is.na(x), 0, x)
tbl[,pct_cols] = format(tbl[,pct_cols], digits=pct_digits, nsmall=pct_digits)
tbl[,n_cols] = format(tbl[,n_cols], digits=n_digits, nsmall=n_digits)
recode(tbl[,pct_cols]) = c(
perl("\\b0.00\\b") ~ "0% ", # additional recodings
perl("\\b100.00\\b") ~ "100% ", # additional recodings
other ~ function(x) paste0(x, "%")
)
tbl
}
infert %>%
tab_cells(parity) %>%
### TOTAL
tab_cols(total()) %>%
tab_stat_cases(label="N", total_row_position="none") %>%
### OTHER VARIABLES
tab_cols(education) %>%
tab_stat_cases(label="N", total_row_position="none") %>%
tab_stat_cpct(label="%Col.", total_row_position="none") %>%
tab_pivot(stat_position="inside_columns") %>%
format_vert()