
Split column values from c(a,b) to a and b in R data frame


I want to find all the positions of a piece of text within a string. Say I have a data frame like the one below:

Key String
10  09123022130908123
11  01230012780981093
12  12387109387126309

Not sure how to put this in a table form here, but the idea is each key has a long string of numbers. To find the location of the text '09' in each string, I used the code:

df$try<-gregexpr(pattern ='09',df$string)

This gave me the following table:

Key String            try
10  09123022130908123 c(1,11)
11  01230012780981093 c(11,15)
12  12387109387126309 c(7,16)

Now I want plain numbers in separate columns rather than a single column containing c(a,b). How can I split such values into a and b in different columns? Any other suggestions for getting all the positions of the required text within a string are welcome. Thanks


Solution

  • Maybe not super beautiful, but it works. First your data (with a fourth row added that contains only one match):

    df <- data.frame(
      key = c(10,11,12,13),
      string = c( 
        "09123022130908123",
        "01230012780981093",
        "12387109387126309",
        "88888888888888809" 
      )
    )
    

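    Here I apply a small helper function to each string; its whichmatch argument selects the first, second, etc. match. I use sapply rather than lapply so that a and b come out as plain numeric columns instead of list columns.

    # Return the position of the whichmatch-th occurrence of '09' in string.
    # Gives NA if there are fewer matches; a string with no match at all gives -1 for whichmatch = 1.
    searchString <- function( string, whichmatch) {
        x <- unlist(gregexpr(pattern ='09', string ))[whichmatch]
        return(x)
    } 
    # USE.NAMES = FALSE stops sapply from naming the results after the strings
    df$a <- sapply( df$string, FUN = function(x) { searchString( x, 1 ) }, USE.NAMES = FALSE)
    df$b <- sapply( df$string, FUN = function(x) { searchString( x, 2 ) }, USE.NAMES = FALSE)
    rm(searchString)
    df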
    
      key            string  a  b
    1  10 09123022130908123  1 11
    2  11 01230012780981093 11 15
    3  12 12387109387126309  7 16
    4  13 88888888888888809 16 NA
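
The approach above needs one line per match column, so it assumes you know in advance how many matches to expect. As a possible generalisation, here is a minimal sketch that creates as many columns as the string with the most matches needs and pads the rest with NA. The column names pos_1, pos_2, ... and the intermediate variable names are made up for this example, and fixed = TRUE is an assumption that '09' should be matched literally rather than as a regular expression.

```r
# Example data, copied from the answer above
df <- data.frame(
  key = c(10, 11, 12, 13),
  string = c("09123022130908123",
             "01230012780981093",
             "12387109387126309",
             "88888888888888809"),
  stringsAsFactors = FALSE
)

# gregexpr returns a list with one vector of match positions per string;
# fixed = TRUE treats '09' as a literal string, not a regular expression
matches <- gregexpr("09", df$string, fixed = TRUE)

# A string without any match yields -1; keep only real positions
positions <- lapply(matches, function(m) as.integer(m[m > 0]))

# Largest number of matches in any string (at least one column)
n_max <- max(lengths(positions), 1)

# Indexing past the end of a vector gives NA, which pads short rows
padded <- do.call(rbind, lapply(positions, function(p) p[seq_len(n_max)]))

# pos_1, pos_2, ... are hypothetical column names chosen for this sketch
colnames(padded) <- paste0("pos_", seq_len(n_max))

result <- cbind(df, padded)
result
```

For the four example strings this gives the same values as columns a and b above, with NA in pos_2 for key 13. Unlike the helper-function version, a string with no match at all ends up with NA in every position column instead of -1.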