Tags: r, character-encoding, knitr

Farsi characters in knitr


I'm using knitr 1.13 in R 3.3.1, and Farsi characters are not handled correctly. I save the following code in a file called test.Rnw and then run knit("test.Rnw"), but the Farsi characters are replaced by ????? in the output.

\documentclass{article}
\usepackage{xepersian} % package for typesetting Farsi in LaTeX
\begin{document}
این یک متن آزمایشی است
<<model>>=
fit <- lm(dist ~ speed, data = cars)
@
برای آزمایش داریم:
\Sexpr{coef(fit)[2]}
\end{document}

Solution

    1. Ensure that the file is saved in UTF-8 encoding.
    2. On most operating systems, that’s enough, since UTF-8 is the system’s default encoding anyway. On Windows in particular, you additionally need to specify the encoding when knitting, e.g.:

      knit('filename.rnw', encoding = 'UTF-8')
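
    The two steps above can be sketched in R as follows. This is only a rough illustration: `is_utf8_file` is a hypothetical helper (not part of knitr) built on base R's `validUTF8()`, and `test.Rnw` is the file name from the question.

```r
# Hypothetical helper (not part of knitr): TRUE if every line of the
# file consists of valid UTF-8 byte sequences.
is_utf8_file <- function(path) {
  lines <- readLines(path, warn = FALSE)  # bytes are read without re-encoding
  all(validUTF8(lines))
}

input <- "test.Rnw"  # file name taken from the question

if (file.exists(input)) {
  if (!is_utf8_file(input)) {
    stop(input, " is not valid UTF-8; re-save it as UTF-8 in your editor")
  }
  # Step 2: state the encoding explicitly when knitting
  knitr::knit(input, encoding = "UTF-8")
}
```

    Note that xepersian requires XeLaTeX, so the resulting .tex file should be compiled with xelatex rather than pdflatex.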
      

    Furthermore, you need to handle the reading direction of your text explicitly: Farsi is right-to-left, but the (Western) source code requires left-to-right text direction. Switch direction in the document with \setLTR before a code chunk and \unsetLTR after it:

    \documentclass{article}
    \usepackage{xepersian}
    % Set a font for the text
    \settextfont{XB Niloofar}
    
    \begin{document}
    این یک متن آزمایشی است
    \setLTR
    <<model>>=
    fit <- lm(dist ~ speed, data = cars)
    @
    \unsetLTR
    برای آزمایش داریم:
    \Sexpr{coef(fit)[2]}
    \end{document}
    

    You could also use a knitr chunk hook to set this automatically for each code chunk. You could probably adapt example 74 for this:

    <<setup, include=FALSE>>=
    library(knitr)
    knit_hooks$set(ltr = function(before, options, envir) {
        if (before) '\\setLTR' else '\\unsetLTR'
    })
    @
    

    Now you can write a code chunk as follows:

    <<model, ltr=TRUE>>=
    fit <- lm(dist ~ speed, data = cars)
    @
    

    There is no need for manual \setLTR and \unsetLTR calls any more.
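
    If every chunk should be typeset left-to-right, one possible shortcut is to make the option a chunk default instead of writing `ltr=TRUE` on each chunk. This is a small sketch that assumes the `ltr` hook from the setup chunk above has already been registered; the lines would go inside that same setup chunk.

```r
# Place inside the setup chunk, after knit_hooks$set(ltr = ...).
library(knitr)

# Chunk hooks run for every chunk whose option of the same name is
# non-NULL, so a default of TRUE triggers the ltr hook for all later chunks.
opts_chunk$set(ltr = TRUE)
```

    An individual chunk can presumably still opt out by setting `ltr=NULL` in its header.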