r, string, stringr, stringi

How to check if a string is made up entirely of certain string patterns


I have a vector of strings that I need to check against certain criteria. Take a string such as "34|40|65" and these patterns: c("34", "35", "37", "48", "65"). If the string is made up entirely of these patterns, I want to return 1. If the string does not contain any of these patterns, I want to return -1. If the string contains some of the patterns but is not made up entirely of them, I want to return 0 (this is the case for "34|40|65", since "40" is not in the list).

I have successfully achieved 1 and -1, but I am having trouble with the logic that would yield 0. As it stands, my logic yields 1 for strings that should yield 0. Here is my code to determine whether a string contains one of these patterns. This gives me the 1s.

acds <- c("34", "35", "37", "48", "65")
grepl(paste(acds, collapse = "|"), data$comp_cd)

Here, data$comp_cd is the vector of strings.

Thanks!


Solution

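  • Split the string on "|" and test each piece against the allowed values with %in% (here, patterns is the vector of allowed codes, i.e. acds in the question):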

    sapply(strsplit(string,"\\|"), function(x) x %in% patterns)
    

    You can easily wrap this in a function to give the numerical result as requested.

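    # Classify a single "|"-separated string:
    #   1 if every piece is in `patterns`, -1 if none is, 0 if only some are.
    checkstring <- function(string, patterns)
    {
      # fixed = TRUE treats "|" literally, so no regex escaping is needed
      pieces <- strsplit(string, "|", fixed = TRUE)[[1]]
      matches <- pieces %in% patterns
      if (all(matches))
        return(1)
      if (!any(matches))
        return(-1)
      return(0)
    }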
    

    Example of usage:

    patterns <- c("34", "35", "37", "48", "65")
    checkstring("34a|65a", patterns = patterns)
    [1] -1
    checkstring("34|65", patterns = patterns)
    [1] 1
    checkstring("34|40|65", patterns = patterns)
    [1] 0
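
    Since the question deals with a whole vector (data$comp_cd) rather than a single string, the same idea can be applied element by element. The following is only a sketch: the name checkstrings and the sample values in comp_cd are made up for illustration and do not come from the question.

    ```r
    # Hypothetical helper (not part of the original answer): classify every
    # element of a character vector of "|"-separated codes.
    checkstrings <- function(strings, patterns) {
      vapply(
        strsplit(strings, "|", fixed = TRUE),  # one character vector per string
        function(pieces) {
          matches <- pieces %in% patterns
          if (all(matches)) 1 else if (!any(matches)) -1 else 0
        },
        numeric(1)                             # each result is a single number
      )
    }

    acds <- c("34", "35", "37", "48", "65")
    comp_cd <- c("34|65", "34|40|65", "40|41", "37")  # made-up sample values
    checkstrings(comp_cd, acds)
    ```

    For these sample values the result is 1, 0, -1, 1. With the data from the question, the call would presumably be checkstrings(data$comp_cd, acds), assuming comp_cd is a character vector and not a factor.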
    

    Hope this helps!