Tags: utf-8, nsstring, ascii, nsdata, barcode-scanner

String from NSData fails using UTF8 but succeeds using ASCII


I am scanning some barcodes and decoding them to Swift strings. The specific scanner provides an object that holds the information I need to build an NSData:

let rawData = decodedData.getData() // UnsafeMutablePointer<UInt8>
let rawDataSize = decodedData.getDataSize() // UInt32
let data = NSData(bytes: rawData, length: Int(rawDataSize)) // NSData

I then decode this into a string:

let string = NSString(data: data, encoding: NSUTF8StringEncoding) as? String

I find that certain barcodes return nil when decoding unless I switch to NSASCIIStringEncoding:

let string = NSString(data: data, encoding: NSASCIIStringEncoding) as? String

My understanding of string encoding is limited, but I was under the impression that any ASCII string can be decoded as UTF-8, since ASCII is a subset of UTF-8. Is this accurate?

If so, what else might be causing this issue?


Solution

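  • Yes, every pure ASCII string is also valid UTF-8. The converse is the problem: not every sequence of bytes is valid when interpreted as UTF-8. For example, a byte with the value 0xff (255) is never valid in UTF-8. Since UTF-8 decoding fails, the data from those barcodes must contain bytes above 0x7f that do not form valid UTF-8 sequences. ASCII itself only defines the values 0x00 to 0x7f, but the NSASCIIStringEncoding decoder appears to accept every byte value anyway, even though this is not really correct.

    You had better take a good look at the data and find out what encoding it actually uses. If it is just arbitrary bytes, do NOT convert it to a string.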
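
One way to take that look at the data is sketched below. It is only an illustration, written in current Swift syntax rather than the older NSString/NSData style used in the question, and the helper names hexDump and nonASCIIOffsets are made up for this example. The byte arrays are invented sample values, not real scanner output. The sketch dumps the bytes as hex, lists the offsets of bytes outside the 7-bit ASCII range, and attempts a strict UTF-8 decode.

```swift
import Foundation

// Hypothetical helper: render bytes as two-digit hex values for inspection.
func hexDump(_ bytes: [UInt8]) -> String {
    return bytes.map { byte -> String in
        let hex = String(byte, radix: 16)
        return byte < 0x10 ? "0" + hex : hex
    }.joined(separator: " ")
}

// Hypothetical helper: offsets of bytes that are outside 7-bit ASCII (> 0x7f).
func nonASCIIOffsets(in bytes: [UInt8]) -> [Int] {
    return bytes.enumerated()
        .filter { $0.element > 0x7F }
        .map { $0.offset }
}

// Invented sample data standing in for what a scanner might return.
let plainASCII: [UInt8] = [0x41, 0x42, 0x43]        // "ABC", also valid UTF-8
let withStrayByte: [UInt8] = [0x41, 0xFF, 0x43]     // 0xff is never valid in UTF-8

// String(bytes:encoding:) returns nil when the bytes are not valid
// in the requested encoding.
let decodedASCII = String(bytes: plainASCII, encoding: .utf8)
let decodedStray = String(bytes: withStrayByte, encoding: .utf8)

print(decodedASCII ?? "not valid UTF-8")            // prints "ABC"
print(decodedStray ?? "not valid UTF-8")            // prints "not valid UTF-8"
print(hexDump(withStrayByte))                       // prints "41 ff 43"
print(nonASCIIOffsets(in: withStrayByte))           // prints "[1]"
```

If the reported offsets point at bytes that belong to a known single-byte encoding (the scanner documentation may say which), decode with that encoding explicitly. If they look like binary payload, keep the data as bytes.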