
How to replace characters in a string vector based on a position vector in R?


For example:

set.seed(123)
library(stringi)
df<-data.frame(p=sprintf("%s", stri_rand_strings(11, 11, '[A-Z]')), 
               n=sample(1:10, 11, replace=TRUE),
               s=sprintf("%s", stri_rand_strings(11, 1, '[A-Z]')))
df
             p  n s
1  GPCMCEHPTEW  3 X
2  STDJRNJGBGX  8 P
3  VTEDZLMEPHF  6 L
4  RHVCVLTRLQA  4 Y
5  FSFVIRYDDRL  7 S
6  VZBLSCZGBRU 10 K
7  JJHCJENNYIM  8 A
8  CWKTELUBVHJ  4 O
9  IANRXAZHYRL 10 M
10 VBTJVNHUCVH  9 W
11 TZCWUKIFOXN  6 V

I want to create a new column new_p in which the character of p at position n is replaced by s. For example, df$new_p[1] should be GPXMCEHPTEW.


Solution

  • One option is the replacement form of substring, applied row by row (note that this overwrites df$p in place):

    for(i in seq_len(nrow(df)))  substring(df$p[i], df$n[i], df$n[i]) <- df$s[i]
    
    
    df
    #             p  n s
    #1  GPXMCEHPTEW  3 X
    #2  STDJRNJPBGX  8 P
    #3  VTEDZLMEPHF  6 L
    #4  RHVYVLTRLQA  4 Y
    #5  FSFVIRSDDRL  7 S
    #6  VZBLSCZGBKU 10 K
    #7  JJHCJENAYIM  8 A
    #8  CWKOELUBVHJ  4 O
    #9  IANRXAZHYML 10 M
    #10 VBTJVNHUWVH  9 W
    #11 TZCWUVIFOXN  6 V
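
    The loop is not strictly required, because the replacement form of substr recycles its start, stop and value arguments across the elements of the string vector. Below is a minimal sketch under that assumption. It uses only the first three rows of the question's data and writes to the new_p column the question asked for, so p is left unchanged.

    ```r
    # Assumed sample: the first three rows of df from the question.
    df <- data.frame(p = c("GPCMCEHPTEW", "STDJRNJGBGX", "VTEDZLMEPHF"),
                     n = c(3L, 8L, 6L),
                     s = c("X", "P", "L"),
                     stringsAsFactors = FALSE)

    # Copy p first so the original column stays untouched.
    df$new_p <- df$p

    # One call handles every row: element i of new_p gets its character
    # at position n[i] replaced by s[i].
    substr(df$new_p, df$n, df$n) <- df$s

    df$new_p
    ```

    For these three rows the result should be GPXMCEHPTEW, STDJRNJPBGX and VTEDZLMEPHF (the third is unchanged because position 6 already holds L).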
    

    We could also use rawToChar/charToRaw, which works here because every character is a single byte (ASCII):

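    # Replace the byte at position y with the byte of z (single-byte characters only);
    # USE.NAMES = FALSE stops mapply from naming the result after the input strings
    df$p <- mapply(function(x, y, z) rawToChar(replace(charToRaw(x), y, 
             charToRaw(z))), df$p, df$n, df$s, USE.NAMES = FALSE)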
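
    Since the question already loads stringi, another possibility is its replacement function stri_sub<-, which is vectorised over the string, the positions and the replacement, and indexes by code point rather than by byte. The sketch below is illustrative only; the vectors p, n and s are assumed stand-ins for the first three rows of df.

    ```r
    library(stringi)

    # Assumed sample: first three rows of the question's data as plain vectors.
    p <- c("GPCMCEHPTEW", "STDJRNJGBGX", "VTEDZLMEPHF")
    n <- c(3L, 8L, 6L)
    s <- c("X", "P", "L")

    # Work on a copy so p itself is preserved.
    new_p <- p

    # Replace the single character at position n[i] of new_p[i] with s[i].
    stri_sub(new_p, from = n, to = n) <- s

    new_p
    ```

    Because positions are counted in code points, this variant should also behave sensibly for non-ASCII strings, where the byte-based charToRaw approach would not.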
    

    data

    df <- structure(list(p = c("GPCMCEHPTEW", "STDJRNJGBGX", "VTEDZLMEPHF", 
    "RHVCVLTRLQA", "FSFVIRYDDRL", "VZBLSCZGBRU", "JJHCJENNYIM", "CWKTELUBVHJ", 
    "IANRXAZHYRL", "VBTJVNHUCVH", "TZCWUKIFOXN"), n = c(3L, 8L, 6L, 
    4L, 7L, 10L, 8L, 4L, 10L, 9L, 6L), s = c("X", "P", "L", "Y", 
    "S", "K", "A", "O", "M", "W", "V")), class = "data.frame",
    row.names = c("1", 
    "2", "3", "4", "5", "6", "7", "8", "9", "10", "11"))