unicode dart flutter icu grapheme-cluster

Handling grapheme clusters in Dart


From what I can tell, Dart does not support grapheme clusters, though there is talk of adding support.

Until it is implemented, what are my options for iterating through grapheme clusters? For example, if I have a string like this:

String family = '\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}'; // 👨‍👩‍👧
String myString = 'Let me introduce my $family to you.';

and there is a cursor immediately after the five-code-point family emoji.

How would I move the cursor one user-perceived character to the left?

(In this particular case I know the size of the grapheme cluster so I could do it, but what I am really asking about is finding the length of an arbitrarily long grapheme cluster.)
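To make the numbers concrete, here is a small sketch in plain Dart (no packages) of what the offsets look like for this string. It only counts code units and code points; it does not find grapheme boundaries, which is the part I am asking about. The `prefix` constant is just a name I introduced for the text before the emoji.

```dart
// The family emoji from the question: three people joined by two ZWJs.
const family = '\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}';

// The text that precedes the emoji in myString.
const prefix = 'Let me introduce my ';

void main() {
  // UTF-16 code units: this is what String.length and cursor offsets count.
  print(family.length); // 8

  // Unicode code points: each person is one code point but two code units.
  print(family.runes.length); // 5

  // Cursor offset just after the emoji, in UTF-16 code units.
  final cursorAfterEmoji = prefix.length + family.length;
  print(cursorAfterEmoji); // 28

  // Moving one user-perceived character left should land here,
  // not at cursorAfterEmoji - 1.
  print(prefix.length); // 20
}
```

So the goal is a way to get from offset 28 to offset 20 in one step without hard-coding the 8.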

Update

I see from this article that Swift uses the system's ICU library. Something similar may be possible in Flutter.

Supplemental code

For those who want to play around with my example above, here is a demo project. The buttons move the cursor to the right or left. It currently takes 8 button presses to move the cursor past the family emoji.

main.dart

import 'package:flutter/material.dart';

void main() => runApp(MyApp());

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        appBar: AppBar(title: Text('Grapheme cluster testing')),
        body: BodyWidget(),
      ),
    );
  }
}

class BodyWidget extends StatefulWidget {
  @override
  _BodyWidgetState createState() => _BodyWidgetState();
}

class _BodyWidgetState extends State<BodyWidget> {

  TextEditingController controller = TextEditingController(
      text: 'Let me introduce my \u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467} to you.'
  );

  @override
  Widget build(BuildContext context) {
    return Column(
      children: <Widget>[
        TextField(
          controller: controller,
        ),
        Row(
          children: <Widget>[
            Padding(
              padding: const EdgeInsets.all(8.0),
              child: RaisedButton(
                child: Text('<<'),
                onPressed: () {
                  _moveCursorLeft();
                },
              ),
            ),
            Padding(
              padding: const EdgeInsets.all(8.0),
              child: RaisedButton(
                child: Text('>>'),
                onPressed: () {
                  _moveCursorRight();
                },
              ),
            ),
          ],
        )
      ],
    );
  }

  void _moveCursorLeft() {
    int currentCursorPosition = controller.selection.start;
    if (currentCursorPosition == 0)
      return;
    int newPosition = currentCursorPosition - 1;
    controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
  }

  void _moveCursorRight() {
    int currentCursorPosition = controller.selection.end;
    if (currentCursorPosition == controller.text.length)
      return;
    int newPosition = currentCursorPosition + 1;
    controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
  }
}

Solution

  • Update: use https://pub.dartlang.org/packages/icu

    Sample code:

    import 'dart:async';

    import 'package:flutter/material.dart';
    import 'package:icu/icu.dart';
    
    void main() => runApp(MyApp());
    
    class MyApp extends StatelessWidget {
      @override
      Widget build(BuildContext context) {
        return MaterialApp(
          home: Scaffold(
            appBar: AppBar(title: Text('Grapheme cluster testing')),
            body: BodyWidget(),
          ),
        );
      }
    }
    
    class BodyWidget extends StatefulWidget {
      @override
      _BodyWidgetState createState() => _BodyWidgetState();
    }
    
    class _BodyWidgetState extends State<BodyWidget> {
      final ICUString icuText = ICUString('Let me introduce my \u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467} to you.\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}');
      TextEditingController controller;

      _BodyWidgetState() {
        controller = TextEditingController(text: icuText.toString());
      }
    
      @override
      Widget build(BuildContext context) {
        return Column(
          children: <Widget>[
            TextField(
              controller: controller,
            ),
            Row(
              children: <Widget>[
                Padding(
                  padding: const EdgeInsets.all(8.0),
                  child: RaisedButton(
                    child: Text('<<'),
                    onPressed: () async {
                      await _moveCursorLeft();
                    },
                  ),
                ),
                Padding(
                  padding: const EdgeInsets.all(8.0),
                  child: RaisedButton(
                    child: Text('>>'),
                    onPressed: () async {
                      await _moveCursorRight();
                    },
                  ),
                ),
              ],
            )
          ],
        );
      }
    
      // Returns a Future so the button handler can await it.
      Future<void> _moveCursorLeft() async {
        int currentCursorPosition = controller.selection.start;
        if (currentCursorPosition == 0)
          return;
        int newPosition = await icuText.previousGraphemePosition(currentCursorPosition);
        controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
      }
    
      // Returns a Future so the button handler can await it.
      Future<void> _moveCursorRight() async {
        int currentCursorPosition = controller.selection.end;
        if (currentCursorPosition == controller.text.length)
          return;
        int newPosition = await icuText.nextGraphemePosition(currentCursorPosition);
        controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
      }
    }
    
    

    Original answer:

    Until Dart/Flutter fully implements ICU, I think your best bet is to use a platform channel to pass the Unicode string to native code (Swift 4+ on iOS, Java/Kotlin on Android), iterate over or manipulate it there, and send back the result.

    • For Swift 4+, this works out of the box, as the article you mention describes (not in Swift 3 or earlier, and not in Objective-C).
    • For Java/Kotlin, replace the standard java.text.BreakIterator with the ICU library's BreakIterator, which works much better. Nothing changes apart from the import statements.

    I suggest doing the manipulation natively rather than in Dart because Unicode has too many cases to handle, such as normalization, canonical equivalence, ZWNJ, ZWJ, and ZWSP.
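    A rough sketch of what the Dart side of the platform-channel approach might look like. The channel name `example.app/grapheme` and the method name `previousGraphemePosition` are hypothetical; the native side (Swift or Java/Kotlin) would have to register a handler under the same names and return an integer offset. It also assumes both sides agree that offsets are UTF-16 code units, which is what Dart strings and `TextSelection` use.

    ```dart
    import 'package:flutter/services.dart';

    // Hypothetical channel name; must match the one registered natively.
    const MethodChannel _graphemeChannel = MethodChannel('example.app/grapheme');

    /// Asks native code for the grapheme boundary before [offset] in [text].
    /// Falls back to a single code-unit step if the native side returns null.
    Future<int> previousGraphemePosition(String text, int offset) async {
      final result = await _graphemeChannel.invokeMethod<int>(
        'previousGraphemePosition', // hypothetical method name
        <String, Object>{'text': text, 'offset': offset},
      );
      return result ?? (offset > 0 ? offset - 1 : 0);
    }
    ```

    The cursor handlers would then await this function in place of the `currentCursorPosition - 1` arithmetic. Every call crosses the channel asynchronously, so for frequent cursor moves it may be worth computing all boundaries for the string in one call instead.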

    Leave a comment if you need some sample code.