Extract the substring between "-" and "-" in a string in R


I have a list of strings that looks like this:

list <- c("chr21-10139833-A-C", "chry-10139832-b-f")

For every string in the list, I need to extract the number between the first "-" and the second "-".

So I would get:

[10139833,10139832]

I tried this:

gsub(".*[-]([^-]+)[-]", "\\1", list)

But it returns:

[ac,bf]

What can I do to make it work? Thank you.


Solution

  • Using str_extract() from the stringr package, we can try:

    library(stringr)

    list <- c("chr21-10139833-A-C", "chry-10139832-b-f")
    # Match a run of digits that is preceded and followed by a hyphen
    nums <- str_extract(list, "(?<=-)(\\d+)(?=-)")
    nums
    
    [1] "10139833" "10139832"
    

    We could also use sub() for a base R option:

    list <- c("chr21-10139833-A-C", "chry-10139832-b-f")
    nums <- sub(".*-(\\d+).*", "\\1", list)
    nums
    
    [1] "10139833" "10139832"
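
  • As for why the original gsub() call fails: the leading .* is greedy, so it consumes everything up to the last pair of hyphens, and the capture group ends up holding the letter between them rather than the number. If the wanted value is always the second hyphen-separated field (an assumption based on the two sample strings), splitting on "-" avoids regular expressions entirely. The following is a minimal base R sketch; the names ids, fields, and nums are illustrative only, and ids is used instead of list to avoid masking the base function of that name:

    ```r
    # Sample vector from the question (renamed to avoid masking base::list)
    ids <- c("chr21-10139833-A-C", "chry-10139832-b-f")

    # Split each string on a literal "-" and keep the second field
    fields <- strsplit(ids, "-", fixed = TRUE)
    nums <- vapply(fields, function(parts) parts[2], character(1))
    nums

    # Convert to numeric if numbers rather than strings are needed
    as.numeric(nums)
    ```

    Here nums holds the same character values as in the two options above. Note that parts[2] is NA for any string with fewer than two fields, so input that does not follow the sample format would need an extra check.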