Tags: ios, objective-c, unicode, nsstring, unicode-string

Encoding NSString containing 3 byte ASCII characters to a proper NSString


A JSON request returns strings with an HTML encoded Unicode character.

It looks like this: valószín&#369; which should be decoded to valószínű

In other words, &#369; should be ű.

I found a list of non-standard HTML character codes here: http://www.starr.net/is/type/htmlcodes.html

Is there any easy way to correct this?


Solution

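  • It appears that the string is only partially escaped: "ű" arrives as the numeric character reference &#369, while the other accented letters are already literal characters. If you encode "valószín&#369" into an NSData object using: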

    NSData * data = [@"valószín&#369" dataUsingEncoding:NSUTF8StringEncoding];

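    and then create an attributed string using

    // initWithHTML:documentAttributes: is macOS-only; iOS needs initWithData:options:documentAttributes:error:
    NSAttributedString * attrString = [[NSAttributedString alloc] initWithHTML:data documentAttributes:nil];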

    the "ű" will be converted properly, but the accented characters that precede it will be mangled, resulting in

    valószínű

    Alternatively, see the following post:

    iOS HTML Unicode to NSString?
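
Because only the numeric character reference needs decoding, one way to avoid the HTML importer is to replace the references directly and leave the rest of the string untouched. The following is a minimal sketch, not taken from the linked post: the helper name `DecodeDecimalCharacterReferences` is hypothetical, and it assumes that only decimal references of the form `&#369;` (with or without the trailing semicolon) occur in the input.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (an assumption, not part of the original answer):
// replaces decimal numeric character references such as "&#369;" or "&#369"
// with the character they denote and leaves all other text as it is.
static NSString *DecodeDecimalCharacterReferences(NSString *input) {
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"&#([0-9]{1,7});?"
                                                  options:0
                                                    error:NULL];
    NSMutableString *result = [input mutableCopy];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:input
                       options:0
                         range:NSMakeRange(0, input.length)];

    // Walk the matches from last to first so that the ranges of the
    // earlier matches stay valid while the string is being modified.
    for (NSTextCheckingResult *match in [matches reverseObjectEnumerator]) {
        NSString *digits = [input substringWithRange:[match rangeAtIndex:1]];
        uint32_t codePoint = (uint32_t)digits.longLongValue;

        // Skip values that are not valid Unicode scalar values.
        if (codePoint == 0 || codePoint > 0x10FFFF ||
            (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
            continue;
        }

        // Build a one-character string from the code point via UTF-32.
        uint32_t littleEndian = CFSwapInt32HostToLittle(codePoint);
        NSString *character =
            [[NSString alloc] initWithBytes:&littleEndian
                                     length:sizeof(littleEndian)
                                   encoding:NSUTF32LittleEndianStringEncoding];
        if (character != nil) {
            [result replaceCharactersInRange:match.range withString:character];
        }
    }
    return result;
}
```

Called as `DecodeDecimalCharacterReferences(@"valószín&#369;")`, this should return `valószínű`, since 369 is the decimal value of U+0171 (ű). Named entities such as `&amp;` and hexadecimal references such as `&#x171;` are deliberately not handled in this sketch.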