Native regex way to replace multiple leading chars with an equal number of spaces


I have some strings that are spaced as I want but that have leading digits that I don't want. I want to replace each of these leading digits with an equal number of spaces so as to maintain the spacing. I can do this with the gsubfn package but am curious if there's a native R regex way to accomplish this task.

Can I accomplish the same result as below using only native R regex functions?

MWE:

library(gsubfn)

string <- c(
    "1    12  end line", 
    "10   3   end line", 
    "50   444 end line", 
    "100  54  end line", 
    "1000 5   end line"
)

gsubfn('^\\d+', function(x) gsub('\\d', ' ', x), string)

Desired Result:

[1] "     12  end line"
[2] "     3   end line"
[3] "     444 end line"
[4] "     54  end line"
[5] "     5   end line"

Solution

  • You want to replace each digit in the run of digits at the start of the string with a single space, leaving later digits untouched.

    Use

    > gsub("\\G\\d", " ", string, perl=TRUE)
    [1] "     12  end line" 
    [2] "     3   end line" 
    [3] "     444 end line"
    [4] "     54  end line" 
    [5] "     5   end line"
    

    See the online regex demo (a bit modified to work with a multiline string input).

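    In the \G\d pattern, \G asserts the position at the start of the string or at the end of the previous successful match, and \d then matches one digit, which is replaced with a space. Because every match has to begin exactly where the previous one ended, only the unbroken run of digits at the start of the string is replaced; the chain stops at the first non-digit, so digits further along the string are left alone.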
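
To make the role of \G concrete, here is a small sketch that runs the same replacement with and without it on the strings from the question. The second half is an alternative using base R's regexpr() and regmatches<-; it is not part of the original answer, only a cross-check that should give the same result. The variable names (unanchored, anchored, via_regmatches) are made up for illustration.

```r
string <- c(
    "1    12  end line",
    "10   3   end line",
    "50   444 end line",
    "100  54  end line",
    "1000 5   end line"
)

# Without \G, every digit in the string is replaced, including
# the second column that should be kept.
unanchored <- gsub("\\d", " ", string, perl = TRUE)

# With \G, a match must start at the beginning of the string or
# where the previous match ended, so only the leading digits go.
anchored <- gsub("\\G\\d", " ", string, perl = TRUE)

# Alternative sketch (assumption, not from the answer): locate the
# leading digit run with regexpr() and overwrite it with a string of
# spaces of the same width via regmatches<-.
via_regmatches <- string
m <- regexpr("^\\d+", via_regmatches)
regmatches(via_regmatches, m) <-
    strrep(" ", nchar(regmatches(via_regmatches, m)))

print(anchored)
print(identical(anchored, via_regmatches))
```

The \G version is the shorter of the two and needs perl = TRUE; the regmatches<- version also works with the default regex engine, at the cost of an extra step.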