
Cannot knit rmarkdown file in PDF with kableExtra table including UTF characters


On a fresh install of R, Rtools, and RStudio on Windows 10, I am unable to knit the following rmarkdown file to PDF (with either pdflatex or xelatex). It consists of a kableExtra table with "é" characters in some cells.

---
title: "test"
output:
  pdf_document:
    keep_tex: yes
    latex_engine: pdflatex
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(kableExtra)
```
```{r reglementation-eu}
reglementation_table_eu <- data.frame(
  polluant = c(rep("PM10", 2), rep("PM25", 3), rep("O3", 4), rep("NO2", 3), rep("SO2", 3), "CO"),
  moyenne  = c("1 jour", "Année calendrier", rep("Annee calendrier", 3), rep("Maximum journalier de la moyenne glissante sur 8 heures", 2), rep("1 heure", 2), rep("1 heure", 2), "Annee calendrier", rep("1 heure", 2), "1 jour", "Maximum journalier de la moyenne glissante sur 8 heures"),
  limite   = c(rep("CO", 16)),
  comment  = c(rep("CO", 16))
)
validUTF8(reglementation_table_eu$moyenne)
kable(reglementation_table_eu, col.names = c("Polluant", "Période de moyenne", "Concentration légale", "Commentaires"), booktabs = T, align = "l", caption = "Europese kwaliteitsnormen voor de omgevingslucht ter bescherming van de menselijke gezondheid.") %>%
  column_spec(2, width = "2.5cm") %>%
  column_spec(3, width = "4.5cm") %>%
  column_spec(4, width = "4.5cm") %>%
  row_spec(0, bold = T) %>%
  collapse_rows(columns = 1, latex_hline = "full", valign = "top")
```

The error message when knitting to PDF is:

Error in temp_sub(target_row, new_row, out, perl = T) : 'pattern' is invalid UTF-8 Calls: <Anonymous> ... column_spec -> column_spec -> column_spec_latex -> temp_sub

Knitting to HTML works perfectly, except that the validUTF8() call returns

## [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
## [13]  TRUE  TRUE  TRUE  TRUE

which may be a clue to the problem with PDF output: the only FALSE is the second element, the one containing "é".

Am I missing something? Thanks!
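To narrow this down, the flagged string can be inspected outside of knitr. The following is only a minimal sketch: it assumes the offending bytes are Latin-1/Windows-1252 (a guess based on the `1252` locale shown in the session info below), and the vector `x` is a hand-made stand-in for the `moyenne` column, not the real data.

```r
# Stand-in for the column: the second element holds the single byte 0xE9,
# which is "é" in Latin-1 / Windows-1252 but is not valid UTF-8 on its own.
x <- c("1 jour", "Ann\xe9e calendrier")

# Which elements are not valid UTF-8?
bad <- !validUTF8(x)
which(bad)

# Assumption: the invalid bytes are Latin-1. Re-encode only those elements.
fixed <- x
fixed[bad] <- iconv(x[bad], from = "latin1", to = "UTF-8")

validUTF8(fixed)
Encoding(fixed)
```

If the re-encoded vector passes `validUTF8()`, the data itself is recoverable and the issue lies in how the session or the source file is being read, not in kableExtra's table code.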

> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=English_Belgium.1252  LC_CTYPE=English_Belgium.1252    LC_MONETARY=English_Belgium.1252 LC_NUMERIC=C                    
[5] LC_TIME=English_Belgium.1252    
system code page: 65001

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] kableExtra_1.3.4 rmarkdown_2.8   

loaded via a namespace (and not attached):
 [1] rstudioapi_0.13   knitr_1.33        xml2_1.3.2        magrittr_2.0.1    rvest_1.0.0       munsell_0.5.0     colorspace_2.0-1  viridisLite_0.4.0
 [9] R6_2.5.0          rlang_0.4.11      stringr_1.4.0     httr_1.4.2        tools_4.1.0       webshot_0.5.2     xfun_0.23         htmltools_0.5.1.1
[17] systemfonts_1.0.2 yaml_2.2.1        digest_0.6.27     lifecycle_1.0.0   glue_1.4.2        evaluate_0.14     stringi_1.6.2     compiler_4.1.0   
[25] scales_1.1.1      svglite_2.0.0

Solution

  • I finally found the cause, almost by accident...

    The new laptop on which I wanted to use RStudio has a brand-new installation of Windows 10 Pro. It seems that, by default, the beta feature "Use Unicode UTF-8 for worldwide language support" is enabled in the "Region Settings" window. With this option on, knitting a kable/kableExtra table containing UTF-8 characters to PDF fails with the error message above. Unchecking the option solved the issue.
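The same condition can be spotted from inside R without opening the Windows settings. This is a sketch only: `codepage_mismatch()` is a hypothetical helper, not part of base R or kableExtra, and it assumes that `l10n_info()` exposes the `codepage` and `system.codepage` fields (which, as far as I know, is the case on Windows with R 4.1 or later). The hand-built list in the last call reconstructs the question's session from its `sessionInfo()` output, assuming R's own code page matches the `1252` locale.

```r
# Hypothetical helper: TRUE when the Windows system code page is UTF-8 (65001)
# while R itself runs under a different code page, as in the question.
codepage_mismatch <- function(info = l10n_info()) {
  sys_cp <- info[["system.codepage"]]  # NULL outside Windows or on older R
  r_cp   <- info[["codepage"]]
  if (is.null(sys_cp) || is.null(r_cp)) {
    return(NA)                         # cannot tell on this platform
  }
  sys_cp == 65001L && r_cp != 65001L
}

# Check the current session.
codepage_mismatch()

# The question's session, rebuilt by hand: locale 1252, system code page 65001.
codepage_mismatch(list(codepage = 1252L, system.codepage = 65001L))
```

A `TRUE` here suggests that the beta UTF-8 option is enabled while R still uses a legacy code page, which is the combination described in this answer.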