Convert non-ASCII to character representations beginning with backslash u (\u) in R?


Running R CMD check --as-cran gives

Portable packages must use only ASCII characters in their R code,
except perhaps in comments.
Use \uxxxx escapes for other characters.

What are \uxxxx escapes and, more importantly, how can I convert non-ASCII characters into them?

What I know so far

  • ?iconv is very informative, and looks powerful, but I see nothing of the form \u
  • The Python documentation describes \uxxxx as:

Character with 16-bit hex value xxxx (Unicode only)

Question

How can I convert non-ASCII characters into character representations of the form \uxxxx?

Some examples: c("¤", "£", "€", "¢", "¥", "₧", "ƒ")


Solution

  • The stringi package provides stri_escape_unicode, which escapes non-ASCII characters as \uxxxx sequences:

    stringi::stri_escape_unicode(c("¤", "£", "€", "¢", "¥", "₧", "ƒ"))
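    ## [1] "\\u00a4" "\\u00a3" "\\u20ac" "\\u00a2" "\\u00a5" "P"       "\\u0192"
    ## The "P" returned for "₧" is probably a native-encoding artifact of this session;
    ## the code point U+20A7 would normally be escaped as "\\u20a7".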
    

    I have an addin based on this function to remove non-ASCII characters between quotes in functions; it is available here: https://github.com/dreamRs/prefixer
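
If adding a stringi dependency is not an option, roughly the same conversion can be sketched in base R. This is only a minimal illustration, not how stringi implements it: escape_one and escape_non_ascii are hypothetical helper names, and the sketch assumes the input can be converted to UTF-8. Each character is turned into its code point with utf8ToInt, and anything above 127 is written as \u followed by four hex digits (or \U followed by eight hex digits for code points beyond U+FFFF).

```r
# Hypothetical helper (not part of base R or stringi): escape one string.
escape_one <- function(s) {
  if (is.na(s)) return(NA_character_)
  cp <- utf8ToInt(s)  # integer Unicode code points of the string
  pieces <- vapply(cp, function(p) {
    if (p < 128L) {
      intToUtf8(p)                 # ASCII characters are kept unchanged
    } else if (p <= 0xFFFF) {
      sprintf("\\u%04x", p)        # e.g. 8364 becomes \u20ac
    } else {
      sprintf("\\U%08x", p)        # code points that need more than 4 hex digits
    }
  }, character(1))
  paste(pieces, collapse = "")
}

# Hypothetical vectorised wrapper around escape_one.
escape_non_ascii <- function(x) {
  vapply(enc2utf8(x), escape_one, character(1), USE.NAMES = FALSE)
}

# The examples from the question, written with escapes so that this
# snippet itself contains only ASCII characters.
x <- c("\u00a4", "\u00a3", "\u20ac", "\u00a2", "\u00a5", "\u20a7", "\u0192")
writeLines(escape_non_ascii(x))

writeLines(escape_non_ascii("caf\u00e9"))
# prints: caf\u00e9
```

Note that print() shows each backslash doubled ("\\u00a4") because it displays the string in escaped form; the string itself contains a single backslash, which is what writeLines() or cat() shows and what should be pasted into the R source file.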