R, utf-8 characters seem to fail in slidify


Update 3: I seem to have fixed the problem by setting the fileEncoding parameters consistently and calling slidify(file.Rmd, encoding="UTF-8"). This also required the fix_encode branch of slidify, installed with install_github('ramnathv/slidify@fix_encode'), although that fix will probably be merged into the main package very soon. I can't say exactly which change fixed it, but the problem was probably that the input files had a different encoding than the script file, and the default encodings seemed to cause trouble. I'm still somewhat unsure of what exactly happened, but I am grateful to and amazed at Ramnathv, the maintainer of slidify, for actively participating in all these discussions. The story of this fix is also in the links in Update 2 below.

I'll keep this post online for archival purposes; perhaps someone else will end up in a situation very similar to mine.

Update 2: There have been issues with encoding with slidify before, some of which have been solved, more info in these links: https://github.com/ramnathv/slidify/issues/377, http://kohske.github.io/ESTRELA/201412/index.html, https://github.com/ramnathv/slidify/issues/373, https://github.com/Koalha/landslide/blob/master/index.Rmd

Update: The issue seems to occur when opening a file that contains UTF-8 characters. Writing r "õ,ä,ü" as inline code in the text seems to work fine, but reading a file whose variables contain UTF-8 characters causes problems in the script. These problems don't occur when using knitr to compile to html5. Does anyone know what the difference between the two could be?

key <- read.csv("key1.csv", header = TRUE, sep = ";", 
        quote = "\"", dec = ".", fill = TRUE, comment.char = "")
print(key)

Adding an encoding or fileEncoding parameter did not seem to help; the knitr HTML build doesn't seem to need it.

Edit: Still not solved, but I may be closer to the cause. It turns out that while the Rmd itself was saved as UTF-8, many of the datasets it uses had reverted to ANSI when saved, so this may be an issue of mixed file encodings. In the best case, slidify(file.Rmd, encoding="UTF-8") seems to compile the .md with correctly encoded characters, but by then errors from encoding mismatches have already occurred during data processing. These issues have not occurred in regular knitr Rmd to HTML conversions.
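
One way to find out which input files are not UTF-8 is sketched below. This is only an illustration: is_utf8_file is a hypothetical helper, the two temporary CSV files are made up for the demonstration, and Latin-1 stands in for "ANSI" (on Windows this is usually Windows-1252).

```r
# Hypothetical helper: TRUE if the file's raw bytes form valid UTF-8.
# Note that pure-ASCII files also count as valid UTF-8.
is_utf8_file <- function(path) {
  bytes <- readBin(path, what = "raw", n = file.size(path))
  validUTF8(rawToChar(bytes))
}

# Two throwaway files with the same content in different encodings
# (assumed sample data, not the real key1.csv).
txt <- "name;value\nJ\u00e4rv;1\n"
utf8_file <- tempfile(fileext = ".csv")
ansi_file <- tempfile(fileext = ".csv")
writeBin(charToRaw(enc2utf8(txt)), utf8_file)
writeBin(charToRaw(iconv(txt, from = "UTF-8", to = "latin1")), ansi_file)

# Check every input file before knitting or slidifying.
files <- c(utf8 = utf8_file, ansi = ansi_file)
is_utf8 <- vapply(files, is_utf8_file, logical(1))
print(is_utf8)
```

Any file reported as FALSE would need either an explicit fileEncoding when it is read or a one-time conversion to UTF-8.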

I'm working on converting an HTML Rmarkdown document into an ioslides-type Rmarkdown document, and I've stumbled onto an unexpected issue. The document works with no issues in a regular knitr build, but seems to run into character encoding issues with exactly the same code, environment and directory in slidify.

Namely, knitr seems to crash when it encounters a variable that contains a non-ASCII character (such as ä, ü, ö), and when the output text contains these characters they are replaced with "?" rectangles.

I am using a setup based on Claas-Thido Pfaff's example: http://cpfaff.github.io/reproducibility/. I haven't yet found a place where it would specify or alter the locale (exactly the same document works fine with HTML output).

---
title       : Reproducible Reports
subtitle    : With R, Knitr, LaTeX and Markdown 
author      : Claas-Thido Pfaff 
job         : http://cpfaff.github.io/reproducibility
framework   : io2012        # {io2012, html5slides, shower, dzslides, ...}
highlighter : highlight.js  # {highlight.js, prettify, highlight}
hitheme     : tomorrow      # 
widgets     : [mathjax, bootstrap]            # {mathjax, quiz, bootstrap}
mode        : selfcontained # {standalone, draft}
knit        : slidify::knit2slides
github      :
  author    : changed
  repo      : reproducibility
---

```{r echo = FALSE, include = F, eval = T}
require(knitr)
hook_source_def = knit_hooks$get('source')
knit_hooks$set(source = function(x, options){
  if (!is.null(options$verbatim) && options$verbatim){
    opts = gsub(",\\s*verbatim\\s*=\\s*TRUE\\s*", "", options$params.src)
    bef = sprintf('\n\n    ```{r %s}\n', opts)
    stringr::str_c(bef, paste(knitr:::indent_block(x, "    "), collapse = '\n'), "\n    ```\n")
  } else {
     hook_source_def(x, options)
  }
})

require(ggplot2)
```

Any ideas? Thanks!


Solution

  • As recommended by @Nisse-Engström, I am posting the final solution as an answer instead of as an update to the original question. I believe the fix_encode changes should be integrated into the main slidify package by now.

    I seem to have fixed the problem by setting the fileEncoding parameters consistently and calling slidify(file.Rmd, encoding="UTF-8"). At the time this also required the fix_encode branch of slidify, installed with install_github('ramnathv/slidify@fix_encode'). I can't say exactly which change fixed it, but the problem was probably that the input files had a different encoding than the script file, and the default encodings seemed to cause trouble. I'm still somewhat unsure of what exactly happened, but I am grateful to and amazed at Ramnathv, the maintainer of slidify, for actively participating in all these discussions. The story of this fix is also in the links in the question.
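
A minimal sketch of what coordinating the fileEncoding parameters could look like, assuming the data file is Latin-1 ("ANSI") and the R session runs in a UTF-8 locale. The file contents and variable names are made up for illustration, and this is not necessarily the exact sequence that fixed the original document.

```r
# Stand-in for an "ANSI" data file (assumed sample data, written as Latin-1).
txt <- "name;value\nJ\u00e4rv;1\nT\u00fcri;2\n"
ansi_file <- tempfile(fileext = ".csv")
writeBin(charToRaw(iconv(txt, from = "UTF-8", to = "latin1")), ansi_file)

# Declare the file's actual encoding when reading, so the strings are
# converted on input instead of being misread later.
key <- read.csv(ansi_file, header = TRUE, sep = ";", quote = "\"", dec = ".",
                fileEncoding = "latin1", stringsAsFactors = FALSE)

# Optionally re-save the data as UTF-8, so that every input file then
# shares the encoding of the .Rmd file.
utf8_file <- tempfile(fileext = ".csv")
write.table(key, utf8_file, sep = ";", dec = ".", row.names = FALSE,
            fileEncoding = "UTF-8")
key_utf8 <- read.csv(utf8_file, header = TRUE, sep = ";",
                     fileEncoding = "UTF-8", stringsAsFactors = FALSE)
print(key_utf8)

# With all inputs in UTF-8, the deck is then built with the call from
# the post (not run here; it needed the fix_encode branch at the time):
# slidify("file.Rmd", encoding = "UTF-8")
```

Re-saving the data once as UTF-8 keeps the encoding decision in one place, rather than having to remember a different fileEncoding for each file.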