Tags: objective-c, nsstring, base64, xcode4.6, utf8-decode

NSString to UTF8String error when the string contains Chinese characters


I am trying to encrypt data that may contain Chinese characters, but I keep getting null when I decrypt the string. The encryption scheme comes from our Android team, so I want to keep it the same. It looks like when I call [[NSString alloc] initWithData:dataFrom64 encoding:NSUTF8StringEncoding], it gives me an NSString representation of a UTF-8 string, and when I then call UTF8String on that NSString, it returns something unexpected. I printed out everything to see where it goes wrong; sorry for the mess. I can't figure out how to solve this and would really appreciate some help.

   NSLog(@"--------Test begins--------");
   NSString *chinese = @"abcd 測試";

   /** encrypt **/
   char const *testCStr = [testString UTF8String];
   char const *cStr = [chinese UTF8String];
   char *newCStr = (char*)calloc(sizeof(char), strlen(cStr));
   strcpy(newCStr, cStr);

   int lenStr = strlen(cStr);
   int lenKey = testString.length;

   for (int i = 0, j = 0; i < lenStr; i++, j++) {
      if (j >= lenKey) j = 0;
      newCStr[i] = cStr[i] ^ testCStr[j];
   }

   NSString *tempStr = [NSString stringWithUTF8String:[[NSString stringWithFormat:@"%s",newCStr] UTF8String]];
   NSData   *tempData = [tempStr dataUsingEncoding:NSUTF8StringEncoding];
   NSString *base64Str = [tempData base64EncodedString];
   char const *dataCStr = [tempData bytes];
   NSString* dataToStr = [[NSString alloc] initWithData:tempData
                                          encoding:NSUTF8StringEncoding];

   NSLog(@"chinese         : %@", chinese);
   NSLog(@"chinese utf8    : %s ", [chinese UTF8String]);
   NSLog(@"encrypted utf8  : %s ", newCStr);
   NSLog(@"--------Encrypt--------");
   NSLog(@"encrypted str   : %@", tempStr);
   NSLog(@"temp data bytes : %s", dataCStr);
   NSLog(@"data to str     : %@", dataToStr);
   NSLog(@"base64 data     : %@", base64Str);
   NSLog(@"data temp       : %@", tempData );

   /** decrypt**/
   NSData *dataFrom64 = [NSData dataFromBase64String:base64Str];
   NSString *strFromData = [[NSString alloc] initWithData:dataFrom64
                                             encoding:NSUTF8StringEncoding];
   char const *cStrFromData = [strFromData UTF8String];
   char *newStr2 = (char*)calloc(sizeof(char), strlen(cStrFromData));

   strcpy(newStr2, cStrFromData);

   for (int i = 0, j = 0; i < lenStr; i++, j++) {
      if (j >= lenKey) j = 0;
      newStr2[i] = cStrFromData[i] ^ testCStr[j];
   }

   NSLog(@"--------Decrypt--------");
   NSLog(@"data 64         : %@", dataFrom64 );
   NSLog(@"data 64 bytes   : %s", [dataFrom64 bytes]);
   NSLog(@"str from data   : %@", strFromData);
   NSLog(@"cStr from data  : %s", [strFromData UTF8String]);
   NSLog(@"decrypt utf8    : %s", newStr2);
   NSLog(@"decrypt str     : %@", [NSString stringWithUTF8String:newStr2]);

And here is the output:

   --------Test begins--------
   chinese         : abcd 測試
   chinese utf8    : abcd 測試 
   encrypted utf8  : #!B5aºÄõ–ôá 
   --------Encrypt--------
   encrypted str   : #!B5aºÄõ–ôá
   temp data bytes : #!B5aºÄõ–ôá6.889 WebSocke
   data to str     : #!B5aºÄõ–ôá
   base64 data     : IyFCNWHCusOEw7XigJPDtMOh
   data temp       : <23214235 61c2bac3 84c3b5e2 8093c3b4 c3a1>
   --------Decrypt--------
   data 64         : <23214235 61c2bac3 84c3b5e2 8093c3b4 c3a1>
   data 64 bytes   : #!B5aºÄõ–ôá
   str from data   : #!B5aºÄõ–ôá
   cStr from data  : #!B5aºÄõ–ôá
   decrypt utf8    : abcd òÇÙºÛî‚Äì√¥√°
   decrypt str     : (null)
   --------test ends--------

Solution

  • The problem is that newCStr is not null-terminated and, after the XOR, no longer represents a valid UTF-8 string. So this conversion

    NSString *tempStr = [NSString stringWithUTF8String:[[NSString stringWithFormat:@"%s",newCStr] UTF8String]];
    

    is bound to fail (or give a wrong result).
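
    Both pitfalls can be reproduced in plain C, which also compiles as Objective-C. In the question, calloc(sizeof(char), strlen(cStr)) leaves no room for the terminator that strcpy writes, and the XOR can produce both zero bytes and bytes that are illegal in UTF-8. The following is a minimal sketch, not the asker's code: the helper names xor_repeating_key and looks_like_utf8 are made up for illustration, and the UTF-8 check is only structural (it does not reject every overlong or surrogate encoding).

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* XOR buf in place with a repeating key. The length is passed explicitly
       because the result may contain zero bytes, so strlen() cannot be trusted. */
    static void xor_repeating_key(uint8_t *buf, size_t len,
                                  const uint8_t *key, size_t key_len) {
        assert(key_len > 0);
        for (size_t i = 0; i < len; i++) {
            buf[i] ^= key[i % key_len];
        }
    }

    /* Structural UTF-8 check (an illustration, not a full validator):
       every lead byte must be followed by the right number of 10xxxxxx bytes. */
    static bool looks_like_utf8(const uint8_t *s, size_t len) {
        size_t i = 0;
        while (i < len) {
            uint8_t b = s[i];
            size_t extra;
            if (b < 0x80) extra = 0;                      /* ASCII */
            else if (b >= 0xC2 && b <= 0xDF) extra = 1;   /* 2-byte sequence */
            else if (b >= 0xE0 && b <= 0xEF) extra = 2;   /* 3-byte sequence */
            else if (b >= 0xF0 && b <= 0xF4) extra = 3;   /* 4-byte sequence */
            else return false;                            /* stray continuation or bad lead */
            if (extra > len - i - 1) return false;        /* truncated sequence */
            for (size_t k = 1; k <= extra; k++) {
                if ((s[i + k] & 0xC0) != 0x80) return false;
            }
            i += extra + 1;
        }
        return true;
    }
    ```

    With the key "topsecret", the 11 UTF-8 bytes of "abcd 測試" pass looks_like_utf8 before the XOR and fail it afterwards (the sixth byte becomes 0x85, a continuation byte with no lead byte). A plaintext such as "tab" turns into a buffer whose first byte is 0, because 't' ^ 't' is 0, so any %s or strlen-based handling would see an empty string.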

    The following code avoids unnecessary conversions:

    NSLog(@"--------Test begins--------");
    NSString *plainText = @"abcd 測試";
    NSString *keyString = @"topsecret";
    
    /** encrypt **/
    NSMutableData *plainData = [[plainText dataUsingEncoding:NSUTF8StringEncoding] mutableCopy];
    NSData *keyData = [keyString dataUsingEncoding:NSUTF8StringEncoding];
    uint8_t *plainBytes = [plainData mutableBytes];
    const uint8_t *keyBytes = [keyData bytes];
    // XOR each plaintext byte in place with the repeating key
    for (NSUInteger i = 0, j = 0; i < [plainData length]; i++, j++) {
        if (j >= [keyData length]) j = 0;
        plainBytes[i] ^= keyBytes[j];
    }
    NSString *base64Str = [plainData base64EncodedString];
    
    NSLog(@"chinese         : %@", plainText);
    NSLog(@"--------Encrypt--------");
    NSLog(@"base64 data     : %@", base64Str);
    
    /** decrypt**/
    NSData *dataFrom64 = [NSData dataFromBase64String:base64Str];
    
    NSMutableData *decodeData = [dataFrom64 mutableCopy];
    uint8_t *decodeBytes = [decodeData mutableBytes];
    // XOR again with the same key to recover the original bytes
    for (NSUInteger i = 0, j = 0; i < [decodeData length]; i++, j++) {
        if (j >= [keyData length]) j = 0;
        decodeBytes[i] ^= keyBytes[j];
    }
    NSString *decrypted = [[NSString alloc] initWithData:decodeData
                                                  encoding:NSUTF8StringEncoding];
    NSLog(@"--------Decrypt--------");
    NSLog(@"decrypt str     : %@", decrypted);
    

    Output:

    --------Test begins--------
    chinese         : abcd 測試
    --------Encrypt--------
    base64 data     : FQ0TF0WFysmc3ck=
    --------Decrypt--------
    decrypt str     : abcd 測試
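
    The output above can be cross-checked at the byte level without Foundation. This is a rough sketch in plain C under the same assumptions as the answer (UTF-8 plaintext, key "topsecret"); the helper names xor_repeating_key, to_hex and round_trip_sample are hypothetical and exist only for this illustration.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* XOR buf in place with a repeating key; applying it twice restores buf. */
    static void xor_repeating_key(uint8_t *buf, size_t len,
                                  const uint8_t *key, size_t key_len) {
        assert(key_len > 0);
        for (size_t i = 0; i < len; i++) {
            buf[i] ^= key[i % key_len];
        }
    }

    /* Write len bytes as lowercase hex into out (out needs 2 * len + 1 chars). */
    static void to_hex(const uint8_t *data, size_t len, char *out) {
        for (size_t i = 0; i < len; i++) {
            snprintf(out + 2 * i, 3, "%02x", (unsigned)data[i]);
        }
        out[2 * len] = '\0';
    }

    /* Encrypt the sample text, store the cipher bytes as hex in cipher_hex
       (at least 23 chars), decrypt again, and return 1 if the plaintext
       bytes come back unchanged. */
    static int round_trip_sample(char *cipher_hex) {
        /* "abcd 測試" spelled out as UTF-8 bytes to avoid source-encoding issues */
        const uint8_t plain[] = "abcd \xE6\xB8\xAC\xE8\xA9\xA6";
        const uint8_t key[] = "topsecret";
        const size_t len = sizeof plain - 1;
        const size_t key_len = sizeof key - 1;
        uint8_t buf[sizeof plain];

        memcpy(buf, plain, sizeof plain);
        xor_repeating_key(buf, len, key, key_len);   /* encrypt */
        to_hex(buf, len, cipher_hex);
        xor_repeating_key(buf, len, key, key_len);   /* decrypt */
        return memcmp(buf, plain, len) == 0;
    }
    ```

    Calling round_trip_sample with a 23-byte buffer fills it with 150d13174585cac99cddc9, the same 11 bytes that the Base64 string FQ0TF0WFysmc3ck= decodes to. Because the buffer is only ever treated as bytes with an explicit length, the round trip does not depend on the intermediate data being a valid C string or valid UTF-8.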