ios nsstring nsdata nsjsonserialization emoji

NSJSONSerialization and Emoji


I'm currently trying to POST some JSON containing emoji to a Python API. I tried passing the string containing the emoji from my UITextField directly to NSJSONSerialization, but the serializer crashed with no meaningful explanation. Afterwards I tried some format conversion and ended up with something like this:

NSString *uniText = mytextField.text;
NSData *msgData = [uniText dataUsingEncoding:NSNonLossyASCIIStringEncoding];
NSString *goodMsg = [[NSString alloc] initWithData:msgData encoding:NSUTF8StringEncoding] ;

This basically works, except that the resulting UTF-8 text is double-escaped, giving the following:

"title":"\\ud83d\\udc8f\\ud83d\\udc8f\\ud83d\\udc8f\\ud83d"

Any suggestions on how to fix that?


Solution

  • There are two difficulties:
    1. NSString handles Unicode Planes 1 and above poorly: the underlying use of UTF-16 shows through. For example, length returns 2 for a single emoji character, because it counts UTF-16 code units and the emoji is stored as a surrogate pair.
    2. Emoji were put in Plane 1, and they are the first widely used characters there, so a lot of legacy Unicode code does not handle them correctly.
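
To make the first point concrete, here is a minimal sketch that compares length with a count of composed character sequences and looks at the two UTF-16 code units behind one emoji. The helper name ComposedCharacterCount is hypothetical and exists only for this illustration.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: counts composed character sequences (what a user
// would call "characters") instead of UTF-16 code units.
static NSUInteger ComposedCharacterCount(NSString *string) {
    __block NSUInteger count = 0;
    [string enumerateSubstringsInRange:NSMakeRange(0, string.length)
                               options:NSStringEnumerationByComposedCharacterSequences
                            usingBlock:^(NSString *substring,
                                         NSRange substringRange,
                                         NSRange enclosingRange,
                                         BOOL *stop) {
        count++;
    }];
    return count;
}

// Logs both counts plus every UTF-16 code unit of the string.
static void LogUTF16Layout(NSString *string) {
    NSLog(@"length (UTF-16 code units): %lu", (unsigned long)string.length);
    NSLog(@"composed characters: %lu", (unsigned long)ComposedCharacterCount(string));
    for (NSUInteger i = 0; i < string.length; i++) {
        // For a Plane 1 character this prints a high surrogate followed by a low surrogate.
        NSLog(@"unit %lu: 0x%04x", (unsigned long)i, [string characterAtIndex:i]);
    }
}
```

Calling LogUTF16Layout(@"💏") should report a length of 2 and one composed character, with the code units 0xd83d and 0xdc8f. These are the same two values that show up as \ud83d\udc8f in the question's output.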

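    Example code (adapted from @Hot Licks), updated with the OP's emoji. The string goes straight into NSJSONSerialization with no intermediate encoding conversion; the UTF-32 dump is there only to show the code points: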

    NSString *uniText = @"💦💏👒👒💦";
    NSDictionary *jsonDict = @{@"title": uniText};

    // Dump the UTF-32 form only to show the code points (all in Plane 1).
    NSData *utf32Data = [uniText dataUsingEncoding:NSUTF32LittleEndianStringEncoding];
    NSLog(@"utf32Data: %@", utf32Data);

    // Serialize the dictionary directly; the serializer emits UTF-8 JSON.
    NSError *error = nil;
    NSData *jsonData = [NSJSONSerialization dataWithJSONObject:jsonDict options:0 error:&error];
    if (jsonData == nil) {
        NSLog(@"JSON serialization error: %@", error);
    }
    else {
        NSString *jsonString = [[NSString alloc] initWithData:jsonData encoding:NSUTF8StringEncoding];
        NSLog(@"The JSON result is %@", jsonString);
        NSLog(@"jsonData: %@", jsonData);
    }
    

    NSLog output

    utf32Data: a6f40100 8ff40100 52f40100 52f40100 a6f40100
    The JSON result is {"title":"💦💏👒👒💦"}
    jsonData: 7b227469 746c6522 3a22f09f 92a6f09f 928ff09f 9192f09f 9192f09f 92a6227d
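
For contrast, the double escaping from the question can be reproduced with a small sketch. The assumption here is that the string from the question's conversion was then handed to NSJSONSerialization; the helper names NonLossyASCIIString and JSONStringWithTitle are hypothetical and used only for illustration.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper repeating the question's conversion: every non-ASCII
// character is replaced by literal backslash-u escape text.
static NSString *NonLossyASCIIString(NSString *text) {
    NSData *asciiData = [text dataUsingEncoding:NSNonLossyASCIIStringEncoding];
    return [[NSString alloc] initWithData:asciiData encoding:NSUTF8StringEncoding];
}

// Hypothetical helper: serializes {"title": text} and returns the JSON as a
// string, or nil if serialization fails.
static NSString *JSONStringWithTitle(NSString *text) {
    NSError *error = nil;
    NSData *jsonData = [NSJSONSerialization dataWithJSONObject:@{@"title": text}
                                                       options:0
                                                         error:&error];
    if (jsonData == nil) {
        NSLog(@"JSON serialization error: %@", error);
        return nil;
    }
    return [[NSString alloc] initWithData:jsonData encoding:NSUTF8StringEncoding];
}

static void CompareEscaping(void) {
    NSString *emoji = @"💏";
    // Direct serialization keeps the emoji as UTF-8 bytes.
    NSLog(@"direct:    %@", JSONStringWithTitle(emoji));
    // The converted string already contains backslashes, and the serializer
    // escapes each of them again, which is the double escaping in the question.
    NSLog(@"converted: %@", JSONStringWithTitle(NonLossyASCIIString(emoji)));
}
```

The takeaway is that the NSNonLossyASCIIStringEncoding round trip turns the emoji into ordinary text that merely looks like an escape sequence, so the server receives backslashes and hex digits rather than the character. Sending the serializer's NSData unchanged as the request body avoids that.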