
R - gsub a specific character at a specific position


I would like to delete the last character of a variable. I was wondering if it is possible to select the position with gsub and delete the character at this particular position.

In this example, I want to delete the final digit, the one after the E, in each of my 4 strings.

variables = c('B10243E1', 'B10243E2', 'B10243E3', 'B10243E4')
gsub(pattern = '[[:xdigit:]]{8}.', replacement = '', x = variables)

I thought we could use {} in order to select a specific position.


Solution

  • You can do it by capturing all the characters but the last:

    variables = c('B10243E1', 'B10243E2', 'B10243E3', 'B10243E4')
    gsub('^(.*).$', '\\1', variables)
    

    Explanation:

    • ^ - Start of the string
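    • (.*) - Capture group 1: any characters, as many as possible, up to
    • .$ - The last character (matched with ., but not captured) before the end of string ($).

    Thus, this regex is good to use if you plan to remove the final character. With R's default regex engine (TRE), . also matches a newline; with perl = TRUE it does not, so in that mode the string must not contain a newline.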

    Output:

    [1] "B10243E" "B10243E" "B10243E" "B10243E"  
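
    As a minimal sketch of two alternatives (not part of the original answer), the same result can probably be reached without a capture group, or without a regex at all. The names by_regex and by_substr are only illustrative:

    ```r
    variables <- c('B10243E1', 'B10243E2', 'B10243E3', 'B10243E4')

    # Regex without a capture group: replace the final character with nothing.
    # sub() is enough here because the $ anchor allows at most one match.
    by_regex <- sub('.$', '', variables)

    # No regex: keep characters 1 through (length - 1) of each string.
    by_substr <- substr(variables, 1, nchar(variables) - 1)

    print(by_regex)
    print(by_substr)
    ```

    Both calls are vectorised over the input, so strings of different lengths are each shortened by one character.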
    

    To remove only the 8th character (in this sample, a T is appended to each item):

    variables = c('B10247E1T', 'B10243E2T', 'B10243E3T', 'B10243E4T')
    gsub('^(.{7}).', '\\1', variables)
    

    Output of the sample program (note the ET at the end of each item: the digit was removed):

    [1] "B10247ET" "B10243ET" "B10243ET" "B10243ET"
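
    The fixed {7} can be generalised to any position by building the pattern from a number. The sketch below assumes a hypothetical helper, remove_char_at, which is not a base R function and not part of the original answer:

    ```r
    # Hypothetical helper: drop the character at 1-based position `n`
    # from each string in `x`.
    remove_char_at <- function(x, n) {
      stopifnot(n >= 1)
      # Capture the first n - 1 characters, match one more character,
      # then keep only the captured part.
      pattern <- sprintf('^(.{%d}).', as.integer(n - 1))
      sub(pattern, '\\1', x)
    }

    variables <- c('B10247E1T', 'B10243E2T', 'B10243E3T', 'B10243E4T')
    result <- remove_char_at(variables, 8)
    print(result)
    ```

    Strings shorter than n characters should come back unchanged, since the pattern then fails to match and sub() leaves the input as it is.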