Tags: r, string, stringr

How to replace characters in a string one at a time generating new string for each replacement?


I have a vector of strings

c("YSAHEEHHYDK", "HEHISSDYAGK", "TFAHTESHISK", "ISLGEHEGGGK", 
"LSSGYDGTSYK", "FGTGTYAGGEK", "VGASTGYSGLK", "TASGVGGFSTK", "SYASDFGSSAK", 
"LYSYYSSTESK")

For each string, I would like to replace "Y", "S" or "T" with "pY", "pS" or "pT". But I don't want all the replacements to end up in the same final string; I want each replacement to generate a new string, e.g.

"YSAHEEHHYDK" turns into

c("pYSAHEEHHYDK",
"YpSAHEEHHYDK",
"YSAHEEHHpYDK")

Solution

  • You could write a function in base R:

    Edit: this now uses a zero-length match, as shown by @GKi, so that "p" is inserted in front of each matched character instead of replacing it.

    strings <-  c("YSAHEEHHYDK", "HEHISSDYAGK", "TFAHTESHISK", "ISLGEHEGGGK", 
                  "LSSGYDGTSYK", "FGTGTYAGGEK", "VGASTGYSGLK", "TASGVGGFSTK", 
                  "SYASDFGSSAK", "LYSYYSSTESK")
    
    
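    # Locate every Y, S or T in each string (one list element per string)
    reg <- gregexpr("[YST]", strings)
    # Repeat each string once per match, then put "p" at each match position;
    # a match.length of 0 turns the replacement into an insertion
    `regmatches<-`(rep(strings, lengths(reg)), 
                  `attr<-`(unlist(reg), "match.length", 0),  value = 'p')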
    
    #>  [1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK" "HEHIpSSDYAGK" "HEHISpSDYAGK"
    #>  [6] "HEHISSDpYAGK" "pTFAHTESHISK" "TFAHpTESHISK" "TFAHTEpSHISK" "TFAHTESHIpSK"
    #> [11] "IpSLGEHEGGGK" "LpSSGYDGTSYK" "LSpSGYDGTSYK" "LSSGpYDGTSYK" "LSSGYDGpTSYK"
    #> [16] "LSSGYDGTpSYK" "LSSGYDGTSpYK" "FGpTGTYAGGEK" "FGTGpTYAGGEK" "FGTGTpYAGGEK"
    #> [21] "VGApSTGYSGLK" "VGASpTGYSGLK" "VGASTGpYSGLK" "VGASTGYpSGLK" "pTASGVGGFSTK"
    #> [26] "TApSGVGGFSTK" "TASGVGGFpSTK" "TASGVGGFSpTK" "pSYASDFGSSAK" "SpYASDFGSSAK"
    #> [31] "SYApSDFGSSAK" "SYASDFGpSSAK" "SYASDFGSpSAK" "LpYSYYSSTESK" "LYpSYYSSTESK"
    #> [36] "LYSpYYSSTESK" "LYSYpYSSTESK" "LYSYYpSSTESK" "LYSYYSpSTESK" "LYSYYSSpTESK"
    #> [41] "LYSYYSSTEpSK"
    

    Created on 2023-02-14 with reprex v2.0.2
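
    Because the result is one flat vector, it can help to keep track of which input each variant came from. The following is a minimal sketch of one way to do that, reusing the same call on only the first two strings; the names `res` and `out` are illustrative and not part of the original answer.

    ```r
    strings <- c("YSAHEEHHYDK", "HEHISSDYAGK")

    reg <- gregexpr("[YST]", strings)

    # Number of matches per string, i.e. how many variants each one produces
    lengths(reg)
    #> [1] 3 3

    # Same call as above, stored instead of printed
    res <- `regmatches<-`(rep(strings, lengths(reg)),
                          `attr<-`(unlist(reg), "match.length", 0), value = "p")

    # Pair every variant with the string it was generated from
    out <- data.frame(source  = rep(strings, lengths(reg)),
                      variant = res)
    ```

    The `source` column is built with the same `rep(strings, lengths(reg))` expression that drives the replacement, so the two columns always line up.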

    You can also wrap this in a small function:

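    my_replace <- function(x){
      # Positions of every Y, S or T in each element of x
      reg <- gregexpr("[YST]", x)
      # One copy of each string per match; zero-length match inserts "p"
      `regmatches<-`(rep(x, lengths(reg)),
                     structure(unlist(reg), match.length = 0),
                     value = "p")
    }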
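
    If calling `regmatches<-` as a function is hard to follow, the same idea can be spelled out one match at a time with `substr()` and `paste0()`. This is only a sketch: `phospho_variants` is a hypothetical helper name, and returning an empty vector for strings without a match is an assumption about the desired behaviour, not something stated in the question.

    ```r
    # Hypothetical helper: build one variant per matched character in a single string
    phospho_variants <- function(x, pattern = "[YST]") {
      # Start positions of all matches; gregexpr() returns -1 when there is none
      pos <- as.integer(gregexpr(pattern, x)[[1]])
      if (pos[1] == -1L) return(character(0))  # assumed: no match, no variants

      # For each position, glue together: text before the match, "p", the rest
      vapply(pos, function(i) {
        paste0(substr(x, 1, i - 1), "p", substr(x, i, nchar(x)))
      }, character(1))
    }

    phospho_variants("YSAHEEHHYDK")
    #> [1] "pYSAHEEHHYDK" "YpSAHEEHHYDK" "YSAHEEHHpYDK"

    # Apply to a whole vector and flatten the result
    all_variants <- unlist(lapply(c("YSAHEEHHYDK", "HEHISSDYAGK"), phospho_variants))
    ```

    This version is longer and loops in R rather than relying on vectorised replacement, but each step is explicit, which can make it easier to adapt (for example, to insert a different prefix per residue).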