java android unicode bidi

How to detect if text contains [FSI]*[PDI]


In Android Studio's logcat, an incoming notification message is shown like this: []message[].

When I copy and paste it into a .txt file, it shows FSImessagePDI.


What are these "FSI" and "PDI" characters, and how can I detect when the text contains them?


Solution

  • These are special Unicode characters used for bidirectional text. They belong to the group of explicit directional isolate formatting characters. In your example they are used to insert a text fragment whose direction is not known in advance: the fragment is wrapped in FSI and PDI so that it does not disturb the direction of the surrounding text. See the Unicode Bidirectional Algorithm (UAX #9) for more information.
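As an illustration of the wrapping described above, here is a minimal sketch. The class name `IsolateWrapper` and the method `isolate` are hypothetical names chosen for this example; whatever produced the notification text in the question may do this differently.

```java
final class IsolateWrapper {
    // Explicit directional isolate formatting characters.
    static final char FSI = '\u2068'; // First Strong Isolate
    static final char PDI = '\u2069'; // Pop Directional Isolate

    // Hypothetical helper: wraps a fragment of unknown direction in FSI ... PDI.
    static String isolate(String fragment) {
        return FSI + fragment + PDI;
    }

    public static void main(String[] args) {
        String message = "Hi " + isolate("Bob") + "!";
        // The two isolate characters are invisible but still count as chars.
        System.out.println(message.length()); // 9
    }
}
```

The wrapped string looks the same as the plain one in most viewers, which is why the extra characters only become noticeable in tools such as logcat or a text editor.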

    To detect them, we need to know their Unicode code points:

    • First Strong Isolate (FSI): U+2068 (a single UTF-16 code unit, 0x2068).
    • Pop Directional Isolate (PDI): U+2069 (a single UTF-16 code unit, 0x2069).
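If only a yes/no answer is needed, a plain character search is enough. The following is a minimal sketch; `IsolateDetector` and `containsIsolate` are hypothetical names introduced for this example.

```java
final class IsolateDetector {
    static final char FSI = '\u2068'; // First Strong Isolate
    static final char PDI = '\u2069'; // Pop Directional Isolate

    // Returns true if the text contains at least one FSI or PDI character.
    static boolean containsIsolate(String text) {
        return text.indexOf(FSI) >= 0 || text.indexOf(PDI) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(containsIsolate("\u2068message\u2069")); // true
        System.out.println(containsIsolate("message"));             // false
    }
}
```

Both characters fit in a single `char`, so `String.indexOf` works without any surrogate-pair handling.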

    Now we can use the regex \u2068(.*?)\u2069 to extract the wrapped content:

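    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String input = "Hi \u2068Bob\u2069!\nHow is \u2068Alice\u2069?";
    System.out.println(input);
    
    // Reluctant match of everything between an FSI and the next PDI.
    Pattern p = Pattern.compile("\u2068(.*?)\u2069");
    Matcher m = p.matcher(input);
    while (m.find()) {
        // Group 1 is the wrapped content without the isolate characters.
        System.out.println(m.group(1));
    }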
    

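    Output (the FSI and PDI characters in the first two lines are invisible when printed):

    Hi Bob!
    How is Alice?
    Bob
    Alice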