r string expression italic

How to make only lower case characters italic in a mixed string


This question arose while trying to solve another one, "How to write partial string of X labels in italics using ggplot2?":

I want to know how to italicize only the lowercase characters in a string:

string <- "wbfV/wcvB"
[1] "wbfV/wcvB"

Desired output:

  • wbfV/wcvB (with the lowercase letters wbf and wcv in italics)

Background: I would then like to use it for labelling a plot.

I tried it this way, but it obviously does not work:

library(stringr)

expression(str_detect(string, '[a-z]'~italic(str_detect(string, '[A-Z]'))))

which I tried to use as an axis label:

plot(1, xlab=expression(str_detect(string, '[a-z]'~italic(str_detect(string, '[A-Z]')))))



Solution

  • I'm not very familiar with using expressions in R directly; I have only ever used latex2exp, so I'll use it here as well. The key to this task is splitting the string at the right places with lookarounds. After that, every other substring can be made italic.

    library(latex2exp)
    library(stringr)
    library(purrr)
    
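    "wbfV/wcvBa" |>
      # split at the boundaries of each lowercase run
      str_split("(?<=[a-z])(?![-a-z])|(?<![-a-z])(?=[a-z])") |>
      unlist() |>
      # odd-numbered pieces are kept as-is, even-numbered ones get \textit{}
      imap_chr(\(x,i) ifelse(i %% 2, x, str_c("\\textit{", x, "}"))) |>
      str_c(collapse = "") |>
      # magrittr pipe on the next step so that `.` works as a placeholder
      TeX() %>%
      plot(1, xlab = .)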
    

    Created on 2022-07-30 by the reprex package (v2.0.1)
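
    For the example input, the pipeline appears to hand TeX() a string in which every run of lowercase letters is wrapped in \textit{}. As a rough cross-check, the same string can be built with base R's gsub(). This is only a sketch: the name tex_label is made up for illustration, and unlike the str_split() pattern, this one gives hyphens no special treatment.

    ```r
    # Wrap each run of ASCII lowercase letters in \textit{...}.
    # perl = TRUE selects PCRE, where [a-z] is a code-point range.
    # In the replacement, "\\\\" yields one literal backslash and "\\1"
    # refers to the captured lowercase run.
    tex_label <- gsub("([a-z]+)", "\\\\textit{\\1}", "wbfV/wcvBa", perl = TRUE)

    # writeLines() shows the string without R's escaping of backslashes
    writeLines(tex_label)
    ```

    The result could then be passed to TeX() in place of the str_split(), imap_chr() and str_c() steps.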

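    The regex passed to str_split() consists of two alternatives, each made of two lookarounds:

    Split between either 
    (?<=[a-z])   a lowercase letter 
    (?![-a-z])   not followed by a lowercase letter or a hyphen
    |            OR 
    (?<![-a-z])  a position not preceded by a lowercase letter or a hyphen
    (?=[a-z])    followed by a lowercase letter

    Because the hyphen is part of the character class in the negative lookarounds, a hyphen next to a lowercase letter does not trigger a split.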
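
    If you would rather not depend on latex2exp, the same idea can probably be expressed with base R plotmath. The following is a minimal sketch, not a tested drop-in: the helper name italic_lower is hypothetical, and instead of splitting with lookarounds it matches alternating runs of lowercase and non-lowercase characters, so it is a swapped-in technique rather than the approach above. It also ignores the hyphen handling described above.

    ```r
    # Hypothetical helper: build a plotmath expression in which every run of
    # ASCII lowercase letters is italic and everything else is left upright.
    italic_lower <- function(s) {
      # Extract alternating runs: lowercase letters, or anything else
      pieces <- regmatches(s, gregexpr("[a-z]+|[^a-z]+", s, perl = TRUE))[[1]]
      is_lower <- grepl("^[a-z]+$", pieces, perl = TRUE)
      # deparse() turns each piece into a quoted, escaped string literal
      quoted <- vapply(pieces, deparse, character(1), USE.NAMES = FALSE)
      terms <- ifelse(is_lower, paste0("italic(", quoted, ")"), quoted)
      # In plotmath, `*` places terms next to each other without a space
      parse(text = paste(terms, collapse = " * "))
    }

    label <- italic_lower("wbfV/wcvB")
    ```

    The resulting expression can be used directly as a label, for example plot(1, xlab = label).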