
Using gsubfn to replace many instances within a string


I wrote a function that reformats a string representing a number (magrittr is loaded on my system):

adjust_perc_format <- function(x, n = 3) {
  gsub(",", ".", x, perl = TRUE) %>% as.numeric() %>% format(nsmall = n, decimal.mark = ",")
}

So that:

adjust_perc_format("2,5", 3) 
[1] "2,500"

The goal is to transform the occurrences defined by a regex within a string (see here). For this purpose I tried gsubfn:

str <- "20MG/ML (2,5%)+0,5%"
gsubfn("[\\d,]+(?=%)", function(x) adjust_perc_format(x, n=3),str)

The expected result is "20MG/ML (2,500%)+0,500%". Instead, I got the same input string ("20MG/ML (2,5%)+0,5%").

I also tried setting the engine as below, without success:

options(gsubfn.engine = "R")

What am I missing here? Thank you.


Solution

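  • You need to tell gsubfn to pass the whole match to the function by adding the backref=0 argument. When backref is omitted, gsubfn guesses its value by counting the unescaped left parentheses in the pattern, so the lookahead (?=%) is mistaken for a capturing group and the function is no longer handed the whole match: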

    gsubfn("[\\d,]+(?=%)", function(x) adjust_perc_format(x, n=3),str, backref=0)
    

    R test:

    > library(gsubfn)
    > str <- "20MG/ML (2,5%)+0,5%"
    > gsubfn("[\\d,]+(?=%)", function(x) adjust_perc_format(x, n=3),str, backref=0)
    [1] "20MG/ML (2,500%)+0,500%"
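
    As a cross-check that does not depend on how gsubfn counts backreferences, the same replacement can be sketched in base R with gregexpr and regmatches<-. This is an alternative technique, not what the answer above uses; the variable names s, m and out are illustrative, the function is rewritten without the magrittr pipe so the block is self-contained, and the pattern is the stricter one suggested below.

    ```r
    # Same formatting helper as in the question, without the magrittr pipe.
    adjust_perc_format <- function(x, n = 3) {
      format(as.numeric(gsub(",", ".", x, fixed = TRUE)),
             nsmall = n, decimal.mark = ",")
    }

    s <- "20MG/ML (2,5%)+0,5%"

    # Locate every number that is directly followed by "%" (lookahead needs perl = TRUE).
    m <- gregexpr("\\d+(?:,\\d+)*(?=%)", s, perl = TRUE)

    # Reformat each match one by one, then write the results back into the string.
    out <- s
    regmatches(out, m) <- lapply(
      regmatches(out, m),
      function(v) vapply(v, adjust_perc_format, character(1), USE.NAMES = FALSE)
    )
    out
    ```

    Formatting each match separately with vapply avoids format() padding all values of a vector to a common width.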
    

    The character class [\\d,]+ also matches stray commas, so if you want a more reliable pattern, you may use

    gsubfn("\\d+(?:,\\d+)*(?=%)", function(x) adjust_perc_format(x, n=3),str, backref=0)
    

    which will match one or more digits followed by zero or more occurrences of a comma and one or more digits (all followed by a % that is not consumed, since it sits in a positive lookahead).
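
    To see the difference between the two patterns, the sketch below lists what each one matches on a deliberately messy input. The string messy and the variable names are made up for illustration, and base R's gregexpr is used only to inspect the matches.

    ```r
    # Hypothetical input with a lone comma and a number missing its integer part.
    messy <- "A,%  1,5%  ,7%"

    loose  <- "[\\d,]+(?=%)"          # any run of digits and commas before "%"
    strict <- "\\d+(?:,\\d+)*(?=%)"   # digits, optionally followed by ",digits" groups

    loose_hits  <- regmatches(messy, gregexpr(loose,  messy, perl = TRUE))[[1]]
    strict_hits <- regmatches(messy, gregexpr(strict, messy, perl = TRUE))[[1]]

    loose_hits   # ","   "1,5" ",7"
    strict_hits  # "1,5" "7"
    ```

    The loose pattern hands a bare "," to the replacement function, which as.numeric() cannot parse, while the strict pattern only ever passes well-formed numbers.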