
How to split letters with bracket and numbers in R?


The string is s = '[12]B1[16]M5'

I want to split it into the following results using the strsplit function in R:

let <- c('[12]B', '[16]M')

num <- c(1, 5)

Thanks a lot


Solution

  • You could use a regular expression for this task: gregexpr() locates every match and regmatches() extracts them.

    s = '[12]B1[16]M22'
    
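    # match "[digits]" followed by letters; explicit character classes keep a greedy ".+" from swallowing the whole string
    grx <- gregexpr("\\[[[:digit:]]+\\][[:alpha:]]+",  s)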
    let <- do.call(c, regmatches(s, grx))
    
    #let
    #[1] "[12]B" "[16]M"
    

    If you want all chunks (let + num), you can tweak the pattern as below so that each match also includes the trailing digits. This makes it easy to extract the numeric part.

    # same pattern as before, plus the digits that follow the letters
    grx <- gregexpr("\\[[[:digit:]]+\\][[:alpha:]]+[[:digit:]]+",  s)
    out <- do.call(c, regmatches(s, grx))
    
    num <- gsub(".+\\][[:alpha:]]+", "", out)
    
    #num
    #[1] "1"  "22"
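
Since the question asks specifically for strsplit, here is a minimal sketch of that route. It assumes the string always has the shape "[digits]letters" followed by digits, as in the example; the patterns are illustrative and not tested against other inputs. The numbers are converted with as.numeric because the question wants num as numeric values.

```r
s <- "[12]B1[16]M22"

# Split on the digits that directly follow a letter (lookbehind needs perl = TRUE);
# what remains between the splits is the "[..]letter" part.
let <- strsplit(s, "(?<=[[:alpha:]])[[:digit:]]+", perl = TRUE)[[1]]

# Split on the "[..]letter" part instead; what remains is the numbers.
# A match at the very start of the string yields a leading "", which is dropped.
num <- strsplit(s, "\\[[[:digit:]]+\\][[:alpha:]]+")[[1]]
num <- as.numeric(num[nzchar(num)])
```

Two separate splits keep each pattern simple: one consumes the numbers to leave the letter chunks, the other consumes the letter chunks to leave the numbers.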