Foundation's string encoding isn't what sites are expecting


Specifically, it's encoding a character with an umlaut as two separate characters.

let unencoded = "könnten"
let encoded = unencoded.stringByAddingPercentEncodingWithAllowedCharacters(NSCharacterSet.URLQueryAllowedCharacterSet())!

encoded is then equal to ko%CC%88nnten. So it's converting the ö into o%CC%88: a plain o followed by a combining umlaut (¨, U+0308), where the umlaut and the o are separate.

However, most sites seem to be expecting the encoding to be %C3%B6, which is ö, where the umlaut (¨) and o are one single character.
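The difference between the two forms can be checked at the Unicode scalar and UTF-8 level. This is a minimal sketch in current Swift syntax; the string literals are written with explicit escapes so the form is unambiguous, and `hexBytes` is a hypothetical helper introduced only for this illustration.

```swift
// Decomposed form: "o" followed by U+0308 COMBINING DIAERESIS.
let decomposed = "ko\u{0308}nnten"
// Precomposed form: the single scalar U+00F6.
let precomposed = "k\u{00F6}nnten"

// Hypothetical helper: renders a string's UTF-8 bytes as uppercase hex.
func hexBytes(_ s: String) -> String {
    return s.utf8.map { String($0, radix: 16, uppercase: true) }.joined(separator: " ")
}

// Swift compares strings by canonical equivalence, so the two look equal...
print(decomposed == precomposed)          // true
// ...but they hold different scalars, and so different bytes to percent-encode.
print(decomposed.unicodeScalars.count)    // 8
print(precomposed.unicodeScalars.count)   // 7
print(hexBytes(decomposed))               // 6B 6F CC 88 6E 6E 74 65 6E
print(hexBytes(precomposed))              // 6B C3 B6 6E 6E 74 65 6E
```

The `CC 88` and `C3 B6` byte pairs are exactly what show up as `%CC%88` and `%C3%B6` after percent-encoding.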

You can see an example of the encoding not working here (how Foundation wants to encode it):

https://www.linguee.com/german-english/search?query=ko%CC%88nnten

And how it would ideally be:

https://www.linguee.com/german-english/search?query=k%C3%B6nnten

Is there a better way to be encoding this? Maybe different options or a different framework?


Solution

  • Ideally, the server should cope with both precomposed and decomposed strings. But if necessary, you can precompose the string on the client side:

    let unencoded = "könnten"
    let encoded = unencoded.precomposedStringWithCanonicalMapping
            .stringByAddingPercentEncodingWithAllowedCharacters(.URLQueryAllowedCharacterSet())!
    
    print(encoded) // k%C3%B6nnten
    

    See Technical Q&A QA1235 – Converting to Precomposed Unicode for more information.
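
    In Swift 3 and later the same calls are spelled `addingPercentEncoding(withAllowedCharacters:)` and `.urlQueryAllowed`. The following is a minimal sketch of the same approach in that syntax, assuming the input arrives in decomposed form (written here with an explicit `\u{0308}` escape so the example does not depend on how the source file is saved):

```swift
import Foundation

// Decomposed input: "o" followed by U+0308 COMBINING DIAERESIS.
let decomposed = "ko\u{0308}nnten"

// Percent-encoding alone keeps the base letter and the combining mark separate.
let raw = decomposed
    .addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)

// Normalizing to the precomposed (NFC) form first merges them into U+00F6.
let nfc = decomposed.precomposedStringWithCanonicalMapping
    .addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)

print(raw ?? "nil") // ko%CC%88nnten
print(nfc ?? "nil") // k%C3%B6nnten
```

    Normalizing right before encoding, rather than where the string is created, means text from any source (user input, file names, pasted text) ends up in the same form in the URL.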