Vision language detection

I'm using Vision provided by Apple to convert some images into text. It's working well, but the problem I currently have is with Chinese characters.

I'm doing this currently:

let request = VNRecognizeTextRequest(completionHandler: recognizeTextHandler)
request.recognitionLevel = .accurate
request.recognitionLanguages = try! VNRecognizeTextRequest.supportedRecognitionLanguages(
    for: .accurate,
    revision: request.revision)

It looks like this supports a number of Latin-script languages along with Chinese.

Vision seems to be able to detect languages such as German just fine automatically, but I have to specify Chinese at the front of the recognitionLanguages property for it to work with Chinese.

Is there any way to automatically detect the language of the image?

Solution

"I have to specify Chinese at the front of the recognitionLanguages property for it to work with Chinese."

This is by design. The .accurate level uses an ML-based recognizer, and because Chinese is really complex, it must come first in recognitionLanguages. See WWDC21's "Extract document data using Vision" at 8:02.

This also means that, with this recognizer, there is no way to have Vision automatically detect the language of the image; you have to decide up front whether Chinese goes first.
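If you know an image may contain Chinese, one way to meet the ordering requirement is to move the Chinese codes to the front of the supported list rather than hard-coding the whole array. The following is a minimal sketch, not a definitive implementation: `prioritizing(prefixes:in:)` and `makeTextRequest(completionHandler:)` are hypothetical helper names, and it assumes the Chinese entries returned by Vision start with "zh" (for example zh-Hans and zh-Hant).

```swift
import Foundation

/// Hypothetical helper: moves every language code that starts with one of
/// `prefixes` to the front, keeping the original relative order in each group.
func prioritizing(prefixes: [String], in languages: [String]) -> [String] {
    let isPreferred: (String) -> Bool = { code in
        prefixes.contains { code.hasPrefix($0) }
    }
    return languages.filter(isPreferred) + languages.filter { !isPreferred($0) }
}

#if canImport(Vision)
import Vision

/// Hypothetical helper: builds an accurate text request with Chinese listed first.
@available(macOS 10.15, iOS 13.0, tvOS 13.0, *)
func makeTextRequest(completionHandler: @escaping VNRequestCompletionHandler) throws -> VNRecognizeTextRequest {
    let request = VNRecognizeTextRequest(completionHandler: completionHandler)
    request.recognitionLevel = .accurate
    // Same query as in the question, but with `try` so a failure is propagated
    // to the caller instead of crashing.
    let supported = try VNRecognizeTextRequest.supportedRecognitionLanguages(
        for: .accurate,
        revision: request.revision)
    // Assumption: Chinese codes begin with "zh".
    request.recognitionLanguages = prioritizing(prefixes: ["zh"], in: supported)
    return request
}
#endif
```

For example, prioritizing(prefixes: ["zh"], in: ["en-US", "de-DE", "zh-Hans", "zh-Hant"]) returns ["zh-Hans", "zh-Hant", "en-US", "de-DE"]. This only changes the order of the list; it does not detect the language, so you still have to choose when to use it.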