Suppose I have two strings containing the same text, one encoded as UTF-8 and the other as UTF-16.
Is it safe to assume the UTF-8 string will always be smaller than, or the same size as, the UTF-16 one (in bytes)?
No. While UTF-8 text will usually be shorter, that is not always the case.
Anything between U+0000 and U+FFFF will be represented with 2 bytes (one UTF-16 code unit) in UTF-16.
Characters between U+0800 and U+FFFF will be represented with 3 bytes in UTF-8.
Therefore, a text that contains only (or mostly) characters in that range can easily be longer when represented in UTF-8 than in UTF-16.
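The size difference is easy to check empirically. The following is a minimal Python sketch, not a definitive benchmark: the helper name `encoded_sizes` and the sample strings are made up for illustration, and `utf-16-le` is used so that Python does not add a 2-byte byte order mark to the UTF-16 count.

```python
def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for the given text."""
    # "utf-16-le" encodes without a byte order mark, so only the
    # characters themselves are counted.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Hypothetical sample strings, one per code point range.
samples = {
    "ASCII": "hello",        # U+0000 to U+007F
    "Greek": "αβγ",          # U+0080 to U+07FF
    "Japanese": "日本語",     # U+0800 to U+FFFF
    "Emoji": "😀",            # above U+FFFF
}

for label, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{label}: UTF-8 = {utf8} bytes, UTF-16 = {utf16} bytes")
```

With these samples, only the Japanese string comes out larger in UTF-8 (9 bytes versus 6); the ASCII string is half the size in UTF-8, and the other two are the same size in both encodings.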
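Put differently:
    U+0000 to U+007F: 1 byte in UTF-8, 2 bytes in UTF-16 (UTF-8 is smaller)
    U+0080 to U+07FF: 2 bytes in UTF-8, 2 bytes in UTF-16 (same size)
    U+0800 to U+FFFF: 3 bytes in UTF-8, 2 bytes in UTF-16 (UTF-8 is larger)
    U+10000 to U+10FFFF: 4 bytes in UTF-8, 4 bytes in UTF-16 (same size)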
Note that 5- and 6-byte sequences used to be defined in UTF-8 but are no longer valid under the current standard (RFC 3629 limits UTF-8 to 4 bytes per code point), and they were never necessary to represent Unicode code points.