Search code examples
iosswiftrangensrange

Get range from unicode character symbols. Swift


I use textView(_: shouldChangeTextIn: replacementText:) function to change the input data depending on the situation. I use range, but I can not get the Swift Range when using unicode character symbols (eg ( ͡° ͜ʖ ͡°) ). Please, tell me how it can be done?

 func textView(_ textView: UITextView, shouldChangeTextIn range: NSRange, replacementText text: String) -> Bool {

        let maxLenthNotReached = textView.text.count + (text.count - range.length) <= maxTextLength

        if maxLenthNotReached {
            guard let newRange = Range(range, in: identityString) else { return false }
            identityString = identityString.replacingCharacters(in: newRange, with: text)
        }

        return maxLenthNotReached
    }

Example project

An app crash example http://take.ms/ojIJq

Update: I changed this method but I got a crash again when deleting

"entering data" ""
"testString" "༼ つ ͡° ͜ʖ🇺🇸 ͡° ༽つ( ͡° ͜ʖ🏎 ͡"
"entering data" ""

func textView(_ textView: UITextView, shouldChangeTextIn range: NSRange, replacementText text: String) -> Bool {
    debugPrint("textView.text", textView.text)
    testString = textView.text.replacingCharacters(in: Range(range, in: textView.text)!, with: text)//
    debugPrint("testString", testString)
    return true
}

Update 1: I entered these characters in the textView

( ͡° ͜ʖ ͡🇺🇸°)༼ つ ͡° ͜ʖ ͡🏎° ༽つ

Then I started to delete the characters with the right to the left after the three right few symbols were deleted ° ༽つ, and the 🏎 car emoji has left, then I can not get the range, since I put the guard and application doesn't crash, if I remove that of course there will be app crash.

Full code

class ViewController: UIViewController {

    // MARK: - IBOutlets
    @IBOutlet private weak var textView: UITextView! {
        didSet {
            textView.delegate = self
            textView.text = "( ͡° ͜ʖ ͡🇺🇸°)༼ つ ͡° ͜ʖ ͡🏎° ༽つ"
        }
    }

    // MARK: - Properties

    private var testString = ""

}


extension ViewController: UITextViewDelegate {

    func textView(_ textView: UITextView, shouldChangeTextIn range: NSRange, replacementText text: String) -> Bool {
        guard let newRange = Range(range, in: textView.text) else {
            return false
        }
        testString = textView.text.replacingCharacters(in: newRange, with: text)
        return true
    }

}

Update 2: After talking with Martin, I found and provided one detail that this problem only happens with the Google keyboard, and with the default keyboard everything works as expected.

The original line I had was "( ͡° ͜ʖ ͡🇺🇸°)༼ つ ͡° ͜ʖ ͡🏎° ༽つ”, this line is used for an example.If I start deleting this line from left to right, I get the app crash, Martin asked to show the latest data in the console before the app crashes, last print before crash is textView" "( ͡° ͜ʖ ͡🇺🇸°)༼ つ ͡° ͜ʖ ͡🏎" "range" {27, 1}


Solution

  • As it turned out in the discussion:

    • OP is using the Google keyboard,
    • the text view delegate method is called with

      textView.text = "( ͡° ͜ʖ ͡🇺🇸°)༼ つ ͡° ͜ʖ ͡🏎"
      range = { 27, 1 }
      
    • and then

      let newRange = Range(range, in: textView.text)
      

      returns nil.

    The reason is that the range points into the “middle” of the 🏎 character, which is stored as a UTF-16 surrogate pair. Here is a simplified self-contained example:

    let text = "Hello 🏎!"
    let range = NSRange(location: 7, length: 1)
    let newRange = Range(range, in: text)
    print(newRange as Any) // nil   😭😭
    

    This looks like a bug (in the Google keyboard?) to me, but there is a possible workaround.

    The “trick” is to determine the closest surrounding range of “composed character sequences,” and here is how that can be done (compare From any UTF-16 offset, find the corresponding String.Index that lies on a Character boundary):

    extension String {
        func safeRange(from nsRange: NSRange) -> Range<String.Index>? {
            guard nsRange.location >= 0 && nsRange.location <= utf16.count else { return nil }
            guard nsRange.length >= 0 && nsRange.location + nsRange.length <= utf16.count else { return nil }
            let from = String.Index(encodedOffset: nsRange.location)
            let to = String.Index(encodedOffset: nsRange.location + nsRange.length)
            return rangeOfComposedCharacterSequences(for: from..<to)
        }
    }
    

    Now

    let newRange = textView.text.safeRange(from: range)
    

    returns a String range that enclosed the entire 🏎 character. In our simplified example:

    let text = "Hello 🏎!"
    let range = NSRange(location: 7, length: 1)
    let newRange = text.safeRange(from: range)
    print(newRange as Any) // Optional(Range(...))   😀😀
    print(text.replacingCharacters(in: newRange!, with: "🚂")) // Hello 🚂!