
R: Is it possible to split according to various characters with str_split_fixed?


I have a character vector whose elements I want to split into several parts.

test = c("3 CH • P" ,"9 CH • P" , "2 CH • P" , "2 CH, 5 ECH • V",                 
 "3 ECH • V",  "4 ECH • P" )

I know that with str_split_fixed() from the stringr package I can split each string on a given character. For example:

test.1 = str_split_fixed(test, c("•"), 2)
> test.1
     [,1]           [,2]
[1,] "3 CH "        " P"
[2,] "9 CH "        " P"
[3,] "2 CH "        " P"
[4,] "2 CH, 5 ECH " " V"
[5,] "3 ECH "       " V"
[6,] "4 ECH "       " P"

However, is it possible to split on more than one character (for example, both "•" and ",")?


Solution

  • You could try using gsub to replace the •'s with commas, then split on commas and spaces:

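    library(stringr)

    test <- c("3 CH • P" ,"9 CH • P" , 
              "2 CH • P" , "2 CH, 5 ECH • V",                 
              "3 ECH • V",  "4 ECH • P" )
    
    # replace each "•" with "," so that only commas and spaces remain as delimiters
    test_sub <- gsub("•", ",", test)
    
    # split on runs of commas and/or spaces; returns a matrix padded to 5 columns
    str_split_fixed(test_sub, "[, ]+", n = 5)
    
    # or, use this base R version, which returns a list with an unfixed length:
    strsplit(test_sub, "[, ]+")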
    

    This thread on string splitting may or may not be useful.
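
Another option, offered only as a sketch and not as part of the original answer, is to give str_split_fixed() a regular-expression character class so that it splits on either delimiter directly. The pattern and the choice of n = 3 are assumptions based on the sample data, where no element has more than three parts:

```r
library(stringr)

test <- c("3 CH • P", "9 CH • P", "2 CH • P",
          "2 CH, 5 ECH • V", "3 ECH • V", "4 ECH • P")

# "[•,]" matches either delimiter; "\\s*" on both sides absorbs the
# surrounding spaces so the pieces come back already trimmed
parts <- str_split_fixed(test, "\\s*[•,]\\s*", n = 3)

parts[1, ]  # "3 CH"  "P"     ""
parts[4, ]  # "2 CH"  "5 ECH" "V"
```

Note that rows with fewer pieces are padded with "" on the right, so the letter after the "•" lands in column 2 for most rows but in column 3 for the row that contains a comma.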