Tags: regex, swift, string, split, swift3

How to Split a String Using a Regular Expression


I have a string "323 ECO Economics Course 451 ENG English Course 789 Mathematical Topography". I want to split this string using the regular expression [0-9][0-9][0-9][A-Z][A-Z][A-Z] so that the function returns the array:

Array = 
["323 ECO Economics Course ", "451 ENG English Course",  "789 Mathematical Topography"]

How would I go about doing this in Swift?

Edit: My question is different from the one linked to. I realize that you can split a string in Swift using myString.components(separatedBy: "splitting string"). The issue is that the linked question doesn't address how to make the splitting string a regular expression. I tried mystring.components(separatedBy: "[0-9][0-9][0-9][A-Z][A-Z][A-Z]", options: .regularExpression), but that didn't work.

How can I make the separatedBy: portion a regular expression?


Solution

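  • Swift 3 has no native regular expression type, but Foundation provides NSRegularExpression. Note that the pattern below includes the space between the digits and the letters, and the sample string adds the "MAT" code that was missing from the last course in the question.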

    import Foundation
    
    let toSearch = "323 ECO Economics Course 451 ENG English Course 789 MAT Mathematical Topography"
    
    let pattern = "[0-9]{3} [A-Z]{3}"
    let regex = try! NSRegularExpression(pattern: pattern, options: [])
    
    // NSRegularExpression works with Objective-C NSString, which is UTF-16 encoded,
    // so the search range is measured in UTF-16 code units
    let matches = regex.matches(in: toSearch, range: NSMakeRange(0, toSearch.utf16.count))
    
    // pair each match with the one that follows it (nil for the last match):
    // [(result1, result2), (result2, result3), (result3, nil)]
    // zip, dropFirst and mapping to Optional are what make this pairing possible
    let results = zip(matches, matches.dropFirst().map { Optional.some($0) } + [nil]).map { current, next -> String in
      let range = current.rangeAt(0)
      let start = String.UTF16Index(range.location)
      // if there's a next match, use its starting location as the end of this chunk;
      // otherwise, go to the end of the searched string
      let end = next.map { $0.rangeAt(0) }.map { String.UTF16Index($0.location) } ?? String.UTF16Index(toSearch.utf16.count)
    
      return String(toSearch.utf16[start..<end])!
    }
    
    dump(results)
    

    Running this will output

    ▿ 3 elements
      - "323 ECO Economics Course "
      - "451 ENG English Course "
      - "789 MAT Mathematical Topography"
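
The code above targets Swift 3 (rangeAt and String.UTF16Index(_:) were changed in later releases). As a rough sketch of the same pairing idea for Swift 4 and later, assuming Foundation's NSRange/Range bridging initializers are available: the helper name splitAtMatches(of:in:) is hypothetical and not part of any library. Like the original, it drops any text that precedes the first match.

```swift
import Foundation

// Hypothetical helper (not a library API): returns chunks of `text`, each
// starting at a match of `pattern` and running up to the start of the next match.
func splitAtMatches(of pattern: String, in text: String) -> [String] {
    // Return an empty array rather than crashing on an invalid pattern.
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }

    // NSRegularExpression works in UTF-16 offsets; let Foundation do the conversion.
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    let starts = regex.matches(in: text, range: fullRange)
        .compactMap { Range($0.range, in: text)?.lowerBound }

    // Pair each start with the next start; the last chunk ends at the end of the string.
    let ends = Array(starts.dropFirst()) + [text.endIndex]
    return zip(starts, ends).map { String(text[$0..<$1]) }
}

let sample = "323 ECO Economics Course 451 ENG English Course 789 MAT Mathematical Topography"
print(splitAtMatches(of: "[0-9]{3} [A-Z]{3}", in: sample))
// ["323 ECO Economics Course ", "451 ENG English Course ", "789 MAT Mathematical Topography"]
```

Appending text.endIndex to the shifted list of starts plays the same role as the trailing nil in the original: it gives the last match an end point without a special case.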