
Swift regular expression - unexpected behavior with carriage return `\r`


I have a public/private RSA key as a string in Swift, and I want to strip the BEGIN/END marker lines from it with a regular expression. The actual key string contains special character sequences such as \r\n (carriage return + newline). This is an example:

let publicKey = "-----BEGIN RSA PUBLIC KEY-----\n0123456789\r\n0123456789\r\nabcdefgh\n-----END RSA PUBLIC KEY-----"
let regex = try! NSRegularExpression(pattern: "(\n)?-* ?(BEGIN|END) ((PRIVATE RSA|PUBLIC RSA)|(RSA PRIVATE|RSA PUBLIC)|(PRIVATE|PUBLIC)) KEY ?-*(\n)?", options: NSRegularExpression.Options.caseInsensitive)
let range = NSMakeRange(0, publicKey.count)
print(regex.stringByReplacingMatches(in: publicKey, options: [], range: range, withTemplate: ""))

The printed result is

0123456789
0123456789
abcdefgh--

but should be

0123456789
0123456789
abcdefgh

But when I remove the two carriage return characters, the result is as expected, without the dashes. What is going wrong here?


Solution

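  • Your regex is fine. The problem is the range: publicKey.count counts Characters (extended grapheme clusters), and Swift treats \r\n as a single Character, while NSRange is measured in UTF-16 code units, where \r\n takes two. With two \r\n pairs in the string, the range ends two units early, so the regex never sees the last two dashes.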

    You may fix the issue by using

    let range = NSMakeRange(0, publicKey.utf16.count)
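
    To make the mismatch concrete, here is a minimal sketch that reuses the string and pattern from the question. The names characterCount, utf16Count, fullRange, and stripped are illustrative, and building the NSRange from String indices is shown only as one possible alternative to utf16.count.

    ```swift
    import Foundation

    let publicKey = "-----BEGIN RSA PUBLIC KEY-----\n0123456789\r\n0123456789\r\nabcdefgh\n-----END RSA PUBLIC KEY-----"

    // String.count counts Characters; "\r\n" is a single Character.
    let characterCount = publicKey.count
    // NSRange is measured in UTF-16 code units; "\r\n" is two units.
    let utf16Count = publicKey.utf16.count
    print(utf16Count - characterCount) // 2 (one extra unit per "\r\n")

    // Deriving the range from String indices avoids mixing the two units.
    let fullRange = NSRange(publicKey.startIndex..<publicKey.endIndex, in: publicKey)
    print(fullRange.length == utf16Count) // true

    // Same pattern as in the question, now applied to the whole string.
    let pattern = "(\n)?-* ?(BEGIN|END) ((PRIVATE RSA|PUBLIC RSA)|(RSA PRIVATE|RSA PUBLIC)|(PRIVATE|PUBLIC)) KEY ?-*(\n)?"
    let regex = try! NSRegularExpression(pattern: pattern, options: .caseInsensitive)
    let stripped = regex.stringByReplacingMatches(in: publicKey, options: [], range: fullRange, withTemplate: "")
    print(stripped == "0123456789\r\n0123456789\r\nabcdefgh") // true
    ```

    The two missing code units match the two dashes left over in the question's output.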
    

    Or simply use replacingOccurrences(of:with:options:) with the .regularExpression option, which works on the String directly and needs no NSRange:

    let publicKey = "-----BEGIN RSA PUBLIC KEY-----\n0123456789\r\n0123456789\r\nabcdefgh\n-----END RSA PUBLIC KEY-----"
    let regex = "(?i)(\n)?-* ?(BEGIN|END) ((PRIVATE RSA|PUBLIC RSA)|(RSA PRIVATE|RSA PUBLIC)|(PRIVATE|PUBLIC)) KEY ?-*(\n)?"
    print( publicKey.replacingOccurrences(of: regex, with: "", options: [.regularExpression]) )
    // => 0123456789
    //    0123456789
    //    abcdefgh
    

    If you want to shorten the pattern, use

    (?i)\n?-* ?(?:BEGIN|END) (?:(?:PRIVATE|PUBLIC)(?: RSA)?|RSA (?:PRIVATE|PUBLIC)) KEY ?-*\n?
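
    As a quick check of the shortened pattern in Swift, the sketch below applies it to the string from the question. It assumes a raw string literal (#"..."#) so that \n reaches the regex engine as written; shortPattern and body are illustrative names.

    ```swift
    import Foundation

    let publicKey = "-----BEGIN RSA PUBLIC KEY-----\n0123456789\r\n0123456789\r\nabcdefgh\n-----END RSA PUBLIC KEY-----"

    // Raw string literal: the backslash in \n is passed to the regex engine as is.
    let shortPattern = #"(?i)\n?-* ?(?:BEGIN|END) (?:(?:PRIVATE|PUBLIC)(?: RSA)?|RSA (?:PRIVATE|PUBLIC)) KEY ?-*\n?"#

    // No NSRange is involved, so the "\r\n" sequences cannot skew the match range.
    let body = publicKey.replacingOccurrences(of: shortPattern, with: "", options: .regularExpression)
    print(body == "0123456789\r\n0123456789\r\nabcdefgh") // true
    ```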
    
