I am rewriting my VB.NET code in R and have hit a roadblock. The VB.NET code counts the characters in a string that do not occur in a string of allowed characters:
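Dim StringtoConvert As String = "ABC"
Dim strAllowedChars As String = "AC"
Dim disallowed As Integer = 0
' Mid is 1-based, so it matches the 1 To Len(...) loop bounds
For i = 1 To Len(StringtoConvert)
    ' InStr returns 0 when the character is not among the allowed characters
    If InStr(1, strAllowedChars, Mid(StringtoConvert, i, 1)) = 0 Then
        disallowed = disallowed + 1
    End If
Next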
I can see how to do this in R using loops to search the string for each of the allowed characters, but is there a way in R to do this using an aggregate like the strAllowedChars above?
The str_count function of the stringr package is the closest that I have found, but it looks for matches of the entire strAllowedChars rather than treating each character independently. How can I test StringtoConvert to make sure it contains only the characters in strAllowedChars, taken individually? In other words, in the example above, if a character in StringtoConvert does not match one of the characters in strAllowedChars, then I need to either identify it as such and use another call to replace it, or replace it directly.
The R code that I have tried is:
library(stringr)
testerstring<-"CYA"
testpattern<-"CA"
newtesterstring<-str_count(testerstring,testpattern)
print(newtesterstring)
The desired output is the number of characters in StringtoConvert that are disallowed, given the allowed characters in strAllowedChars. I would then use that count in a loop with an if/then statement to change each disallowed character to a "G", so it would be even better if I could skip the counting step and just replace any disallowed character with a "G" directly.
Here's an approach with str_replace_all. We can generate a regular expression to identify characters that are not in a set. For example, [^AC] matches any character that is not A or C:
library(stringr)
StringtoConvert="ABC"
strAllowedChars="AC"
str_replace_all(StringtoConvert,paste0("[^",strAllowedChars,"]"),"G")
#[1] "AGC"
set.seed(12345)
sample(LETTERS,50,replace = TRUE) %>% paste(collapse = "") -> StringtoConvert2
str_replace_all(StringtoConvert2,paste0("[^",strAllowedChars,"]"),"G")
#[1] "GGGGGGGGGGGGGGGGGGAGGGGGCGGGGGGGGGGGGGGGGGGGGGGGGG"
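The same negated character class also covers the counting part of the question. The following is a minimal sketch, assuming the allowed characters contain nothing that is special inside a regex character class; the name disallowedPattern is introduced here only for illustration and is not from the original code.

```r
library(stringr)

StringtoConvert <- "ABC"
strAllowedChars <- "AC"

# Negated character class: matches any single character not in strAllowedChars
# (disallowedPattern is a hypothetical helper name used for this sketch)
disallowedPattern <- paste0("[^", strAllowedChars, "]")

# str_count counts every match of the pattern, i.e. every disallowed character
disallowed <- str_count(StringtoConvert, disallowedPattern)
print(disallowed)
#[1] 1
```

If strAllowedChars may contain characters such as ], ^, - or \, they would need to be escaped before being pasted into the class, since they have a special meaning there.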