swift, unicode, character-encoding, swift-string

Why is unicode included as an Encoding in Swift's String API?


I read this very important blog regarding string encodings.

After reading it, I realized that Unicode is a standard that maps characters to code points, which are integers. How these integers are stored in memory is an entirely separate concern. This is where .utf8 and .utf16 come into play: they define how the integers are stored in memory.

In the Swift String API there is a method which gives us the data bytes used to represent the String in various encodings:

func data(using encoding: String.Encoding, allowLossyConversion: Bool = false) -> Data?

The first parameter of this method is of type String.Encoding. The Encoding struct declares, among others, this encoding:

static let unicode: String.Encoding

So the method can apparently give me a data representation of the String using the encoding .unicode.

This is the opposite of what I concluded after reading the blog mentioned above. It gives me a data representation of the string, even though Unicode itself does not specify how code points are stored.

Can anyone tell me what I am missing here? I am really confused now.


Solution

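  • String.Encoding.unicode is the same as String.Encoding.utf16. It is a legacy name carried over from Foundation, where NSUnicodeStringEncoding and NSUTF16StringEncoding are the same value, so .unicode does name a concrete byte encoding (UTF-16) rather than the abstract Unicode standard.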

    print(String.Encoding.unicode)
    print(String.Encoding.utf16)
    

    The above prints:

    • Unicode (UTF-16)
    • Unicode (UTF-16)
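
To check this beyond the printed descriptions, here is a minimal sketch that compares the two encodings directly and looks at the bytes they produce. The sample string "A" and the variable names are only illustrative assumptions; the exact UTF-16 bytes (byte order mark, endianness) may differ by platform, so the sketch prints them rather than assuming a particular result.

```swift
import Foundation

// Illustrative sample text (an assumption, any string works).
let text = "A"

// .unicode and .utf16 are the same encoding value, so they compare equal.
let sameEncoding = String.Encoding.unicode == String.Encoding.utf16
print(sameEncoding)

// Because the encodings are identical, the resulting bytes are identical too.
let unicodeData = text.data(using: .unicode)
let utf16Data = text.data(using: .utf16)
let sameBytes = unicodeData == utf16Data
print(sameBytes)

// Helper to show raw bytes as hex, e.g. for comparing with UTF-8.
func hex(_ data: Data?) -> String {
    guard let data = data else { return "nil" }
    return data.map { String(format: "%02X", $0) }.joined(separator: " ")
}

print("unicode:", hex(unicodeData))
print("utf16:  ", hex(utf16Data))
print("utf8:   ", hex(text.data(using: .utf8)))
```

The UTF-8 line is there only for contrast: the same code point is stored differently depending on the encoding, which is the distinction the question draws between Unicode and its encodings.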