
How to insert accented characters in ASCII-encoded Rd files?


Can LaTeX escapes for accents be used in Rd files? I tried the standard \'e and many variants (\'{e}, {\'e}, \\'e, \\'{e}, {\\'e}, etc.), but none is rendered as an accented character in the PDF or HTML output.

I want my References section (i.e., \references{}) to be rendered with accented characters, but I do not want to type non-ASCII characters in my Rd files. Is there a recommended practice? Should I simply replace non-ASCII characters with their ASCII equivalents (é → e, ø → o)?

To be clear, I know it is possible to type accented characters (e.g., é) directly in UTF-8-encoded files, but I would prefer to keep ASCII-encoded files.

This question is not about:

or variants.

Minimal test package

Package structure:

test
test/man
test/man/one.Rd
test/R
test/R/one.R
test/DESCRIPTION

test/man/one.Rd:

\name{one}
\alias{one}
\title{Get One}
\description{Accents are not rendered: \'e \'{e} {\'e} \\'e \\'{e} {\\'e}}
\usage{
one()
}

test/R/one.R:

one <- function() 1

test/DESCRIPTION:

Package: test
Version: 0.1
Title: Test
Author: Nobody
Maintainer: Nobody <[email protected]>
Description: Test.
License: GPL-3

Build, check, and install with:

$ R CMD build test
$ R CMD check test_0.1.tar.gz
$ R CMD INSTALL test_0.1.tar.gz

Solution

  • Rd syntax is only LaTeX-like: it supports a limited set of macros, but these are not guaranteed to behave like their LaTeX counterparts, if any exist. Conversely, very few LaTeX macros have Rd equivalents.

    This section of the Writing R Extensions manual describes most of the built-in Rd macros. Tables 1, 2, and 3 of the technical paper "Parsing Rd files" (Duncan Murdoch) provide a complete list.

    The "Encoding" section of WRE addresses encoding specifically. Non-ASCII characters in Rd files are supported, provided that you declare an appropriate encoding, either in the files themselves (via \enc and \encoding) or in DESCRIPTION (via the Encoding field). However, WRE encourages restricting Rd files to ASCII characters:

    "Wherever possible, avoid non-ASCII chars in Rd files, and even symbols such as ‘<’, ‘>’, ‘$’, ‘^’, ‘&’, ‘|’, ‘@’, ‘~’, and ‘*’ outside ‘verbatim’ environments (since they may disappear in fonts designed to render text)."

    The recommended way to obtain non-ASCII characters in rendered help without including non-ASCII characters in your Rd files is to use conditional text. \ifelse allows you to provide raw LaTeX for PDF help pages, raw HTML for HTML help pages, and verbatim text for plain text help pages:

    \ifelse{latex}{\out{\'{e}}}{\ifelse{html}{\out{&eacute;}}{e}}
    

    That is quite verbose, so I would suggest defining your own macro(s) in man/macros/macros.Rd. You can do this with \newcommand and \renewcommand:

    \newcommand{\eacute}{\ifelse{latex}{\out{\'{e}}}{\ifelse{html}{\out{&eacute;}}{e}}}
    

    Then you can use \eacute{} freely in all of your Rd files. To check that the text is rendered the way you want in every format, install the package and run help(topic, help_type = ...) with help_type set to "text", "html", or "pdf".
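
    As a rough sketch of how this might extend to a References section, man/macros/macros.Rd could hold one macro per accented character. The \oslash macro and the reference entry below are hypothetical and only for illustration; \oslash assumes that \o is the LaTeX form and &oslash; the HTML entity for o-slash, following the same pattern as \eacute above.

    ```rd
    % man/macros/macros.Rd
    % \eacute is the macro defined above; \oslash is a hypothetical companion
    % for o-slash, built on the same \ifelse pattern (LaTeX, then HTML, then ASCII).
    \newcommand{\eacute}{\ifelse{latex}{\out{\'{e}}}{\ifelse{html}{\out{&eacute;}}{e}}}
    \newcommand{\oslash}{\ifelse{latex}{\out{\o{}}}{\ifelse{html}{\out{&oslash;}}{o}}}

    % man/one.Rd (excerpt); the entry below is a made-up placeholder, not a real reference
    \references{
      Lef\eacute{}vre, A. and S\oslash{}rensen, B. (hypothetical entry, for illustration only).
    }
    ```

    With definitions like these, the Rd sources stay pure ASCII. After reinstalling, the text help should fall back to plain e and o, while the HTML and PDF help should show the accented forms.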