Search code examples
rlatexknitrkableextraflextable

How to show pipe character in kableExtra table in PDF?


I found a nice explanation of data validation on RPubs: https://rpubs.com/techanswers88/validate-data-in-r. I have tried to reproduce the steps in an R notebook. All the steps work perfectly until I get to the tables with the results. The example uses flextable to tabulate the errors from validating the example dataset. The flextable shows well when I knit the notebook preview:flextable in Notebook preview. However, when I knit to PDF the same flextable looks terrible: Flextable in PDF. So I went to kable/KableExtra, as I am more familiar with there packages and I style most of my tables with these. Kable/kableExtra usually works very well with both the R Notebooks and PDF. With Kable/KableExtra I get a more compact table which I prefer. However, some of the validation criteria in the table contain the pipe character (e.g. "(DBP == 0 | SBP ==0) & Outcome == 'Survived' )". The pipe character shows as '|' in the table. This specifically happens as soon as I pass the table setup through KableExtra's styling options.

In the MWE below I have left out the graphics of the original example I tried to reproduce and only focus on the troublesome table. R version used: 4.3.2; dplyr 1.1.4; Flextable 0.9.4; kableExtra 1.4.0; validate 1.1.5.

MWE:

    ---
    title: "kableExtra vs flextable"
    output:
       pdf_document:
         toc: true
         number_sections: true
       html_notebook:
         toc: true
         number_sections: true
    ---
 
    ```{r}
    #Load Libraries
    library(dplyr)
    library(validate)
    library(flextable)
    library(kableExtra)
    ```
 
    ```{r}
    #Create dataset (from https://rpubs.com/techanswers88/validate-data-in-r)
    myData <- data.frame(PatientID = c('P001', 'P002', 'P003' , NA, 'P005', 'P006', 'P007', 'P008', 
                                        'P009', 'P010', 'P008', 'P11'), 
     Age = c(23, 12, 5, 8, 245, NA, 23, 45, 87, 121, 56, 130), 
     Outcome = c('Died',  'Died', NA, 'Survived', 'Survived', 'Survived', 'Survived',
                                   'Survived', 'Survived', '?', 'Survived', 'Unknown'), 
     SBP = c(0, 0, 120, 80, 45, 67, 100, 130, 350, 120, 46, 120), 
     DBP = c(0, 0, 80, 70, 30, 40, 80, 210, NA, 80, 0, 80)
                      )
 
    #Add an id field in the data. 
    myData <- myData%>%
     dplyr::mutate(id  = row_number())
    ```
 
    ```{r}
    #Create validation rules
    myrules <- validator( "Patient Id Unique"   = is_unique(PatientID),
                           "Patient ID"          = is.na(PatientID),
                           "Age in range"        = Age >= 0 & Age <= 120,
                           "Outcome validity "   = Outcome %in% c('Survived', 'Died'),
                           "SBP in range"        = SBP >= 0 & SBP <= 300,
                           "DBP in range"        = in_range (DBP, min = 0 , max =  200),
                           "Survived with No BP" = (DBP == 0 | SBP ==0)  & Outcome == 'Survived' )
    ```
 
    ```{r}
    #create validation results with the confront function
    output <- confront(myData, myrules, key  ="id")
    ```
 
    ```{r}
    #Show errors in table
    #Convert our output into a dataframe first
    dout <- as.data.frame(output)
    ```
 
    ```{r}
    #Errors for each record id
    dErrors <- dout%>%
       dplyr::filter(! value == TRUE)%>%
       dplyr::select(id, name, expression)%>%
       dplyr::arrange(id)
 
    #Display validation results in flextable
    #This part works well in a R Notebook or HTML, but not in PDF.
    ft <- flextable(dErrors)%>%
       theme_booktabs()%>%
       merge_v(j = ~id)%>%
       set_header_labels(name       = "Error",
                         id         =  "Record ID",
                         expression = "Validation expression")
    ft
    ```
 
    ```{r}
    #Try to reproduce the flextable with kable/kableExtra
    #This produces a more compact table but unfortunately without the pipe character
    dout %>% 
       filter(!value==TRUE) %>% 
       select(id, name, expression) %>% 
       arrange(id) %>% 
       kable(booktabs = TRUE, escape = TRUE) %>% 
       kable_classic(full_width = FALSE) %>% 
       column_spec(3, width = "10cm") %>% 
       collapse_rows(columns = 1, valign = "top")
    ```

I have found that if I only sent the results through kable but not kableExtra

```{r}
dout %>% 
  filter(!value==TRUE) %>% 
  select(id, name, expression) %>% 
  arrange(id) %>% 
  kable(booktabs = TRUE, escape = TRUE)
```

then the pipe symbol will show in the table . However as soon as I try to do any styling (e.g. collapse_rows() or kable_styling) then the pipe symbol will be changed to '&#124;'. Passing "format = "latex" to kable does not solve the problem, nor does "escape = TRUE". As said, the pipe character will show in the table in PDF up until the moment I add any styling. I can't find a solution online or in the manual of kableExtra. I have tried updating R and the packages but without result.

I tried to 'fix' flextable to see if I could make that work to get a better looking table, but without success. I have tried to change the width of the flextable with ft <- fit_to_width(ft, max_width = 15, unit = "cm"). However this does not do anything with the flextable width in the PDF and all text remains smashed together. Some advice online suggest to use 'latex_engine: xelatex' in the YAML header however this does not have any effect either.

I would prefer to use kableExtra as I am more familiar with that package. Any advice on how to show the pipe character when using kableExtra?

Thanks in advance.


Solution

  • This seems to work if you use the format option of kable:

        dout %>% 
           filter(!value==TRUE) %>% 
           select(id, name, expression) %>% 
           arrange(id) %>% 
           kable(booktabs = TRUE, format = "html") %>% 
           kable_classic(full_width = FALSE) %>% 
           column_spec(3, width = "10cm") %>% 
           collapse_rows(columns = 1, valign = "top")
    

    If you want to knit either to HTML or to LaTeX, you can do:

    fmt <- ifelse(knitr::is_html_output(), "html", "latex")
        dout %>% 
           filter(!value==TRUE) %>% 
           select(id, name, expression) %>% 
           arrange(id) %>% 
           kable(booktabs = TRUE, format = fmt) %>% 
           kable_classic(full_width = FALSE) %>% 
           column_spec(3, width = "10cm") %>% 
           collapse_rows(columns = 1, valign = "top")