Tags: regex, stringr, stringi

Regular expression to search and replace a string in a file


Hi friends, I am trying to search for particular keywords (given in txt) in a list of files. I am using a regular expression to detect and replace each occurrence of a keyword in a file. Below is the comma-separated list of keywords that I am passing to be searched.

library(stringi)
txt <- "automatically got activated,may be we download,network services,food quality is excellent"

For example, "automatically got activated" should be found and replaced by "automatically_got_activated", "may be we download" by "may_be_we_download", and so on.

for(i in 1:length(txt)) {
    start <- head(strsplit(txt, split=" ")[[i]], 1) #finding the first word of the keyword 
    n <- stri_stats_latex(txt[i])[4]        #number of words in the keyword

    o <- tolower(regmatches(text, regexpr(paste0(start,"(?:[^a-zA-Z'-]+[a-zA-Z'-]+){0,",
        n-1,"}"),text,ignore.case=TRUE)))   #best match for keyword for the regex in the file 

    p <- which(!is.na(pmatch(txt, o)))      #exact match for the keywords
}

Solution

  • I think this may be what you're looking for.

    > txt <- "automatically got activated,may be we download,network services,food quality is excellent"
    

    A made-up vector of sentences to search from:

    > searchList <- c('This is a sentence that automatically got activated',
                      'may be we download some music tonight',
                      'I work in network services',
                      'food quality is excellent every time I go',
                      'New service entrance',
                      'full quantity is excellent')
    

    A function to do the work:

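    replace.keyword <- function(text, toSearch)
    {
        # split the comma-separated keyword string into individual keywords
        kw <- unlist(strsplit(text, ','))
        # underscore version of each keyword
        gs <- gsub('\\s', '_', kw)
        # for each keyword, keep only the sentences that contain it,
        # with the keyword replaced by its underscore version
        sapply(seq_along(kw), function(i){
          ul <- ifelse(grepl(kw[i], toSearch),
                       gsub(kw[i], gs[i], toSearch),
                       "")
          ul[nzchar(ul)]
        })
    }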
    

    The results:

    > replace.keyword(txt, searchList)
    # [1] "This is a sentence that automatically_got_activated"
    # [2] "may_be_we_download some music tonight"              
    # [3] "I work in network_services"                         
    # [4] "food_quality_is_excellent every time I go"   
    

    Let me know if it works for you.
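
One thing to be aware of: the function above returns only the sentences that contain a keyword, and it treats each keyword as a regular expression. If you would rather get back a vector of the same length as the input, with non-matching sentences left untouched and keywords matched literally, something like the following base R sketch may help. The name `replace_keywords` is made up for illustration and is not part of the answer above or of any package.

```r
# Hypothetical helper (an assumption, not the original answer's function):
# replaces every keyword in every sentence and keeps sentences that
# contain no keyword unchanged.
replace_keywords <- function(keywords, sentences) {
  # split the comma-separated string and drop stray spaces around keywords
  kw <- trimws(unlist(strsplit(keywords, ",", fixed = TRUE)))
  # underscore version of each keyword
  gs <- gsub(" ", "_", kw, fixed = TRUE)
  # apply the replacements one keyword at a time over the whole vector;
  # fixed = TRUE matches the keyword literally rather than as a regex
  Reduce(function(s, i) gsub(kw[i], gs[i], s, fixed = TRUE),
         seq_along(kw), sentences)
}

txt <- "automatically got activated,may be we download,network services,food quality is excellent"

searchList <- c('This is a sentence that automatically got activated',
                'may be we download some music tonight',
                'I work in network services',
                'food quality is excellent every time I go',
                'New service entrance',
                'full quantity is excellent')

result <- replace_keywords(txt, searchList)
print(result)
```

Here the first four sentences come back with underscores in place of the keyword spaces, and the last two come back exactly as they went in. Literal matching is case-sensitive, so if you need to ignore case you would have to switch back to a regular expression with `ignore.case = TRUE` and escape any special characters in the keywords yourself.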