I am using MLVision cloud text recognition in my app. I capture or upload a photo and then start the process. Once it recognises the image and extracts the text, I split the text and append every separated block to an array.
The code below covers the whole process.
lazy var vision = Vision.vision()
var textRecognizer: VisionTextRecognizer!
var test = [] as Array<String>

override func viewDidLoad() {
    super.viewDidLoad()

    let options = VisionCloudTextRecognizerOptions()
    options.languageHints = ["en", "hi"]
    textRecognizer = vision.cloudTextRecognizer(options: options)

    // pickedImage is the image that the user captures.
    let visionImage = VisionImage(image: pickedImage)
    textRecognizer.process(visionImage, completion: { (features, error) in
        guard error == nil, let features = features else {
            self.resultView.text = "Could not recognize any text"
            self.dismiss(animated: true, completion: nil)
            return
        }
        for block in features.blocks {
            for line in block.lines {
                //for element in line.elements{
                self.resultView.text = self.resultView.text + "\(line.text)"
            }
        }
    })
}

func separate() {
    let separators = CharacterSet(charactersIn: ":)(,•/·][")
    let ofWordsArray = self.resultView.text.components(separatedBy: separators)
    for word in ofWordsArray {
        let low = word.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
        if low != "" {
            // Append every non-empty piece to the array, as described above.
            self.test.append(low)
        }
    }
}
Everything works fine and I get the result that I want. The problem is that it seems really slow: the entire process takes about 20 seconds. Is there a way to make it faster? Thanks in advance.
You are using the VisionCloudTextRecognizer, so the speed depends on your connection; in my case it took only a few seconds. Your other option is to use on-device text recognition, or a hybrid approach where you first detect on-device and then correct the result with the Cloud API later.
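As a rough illustration of the hybrid idea, here is a minimal sketch. The `Recognizer` typealias, `recognizeHybrid`, and the `update` callback are hypothetical names introduced only for this example. The sketch assumes both recognizers deliver their callbacks on the same queue. The comments at the end show how the two closures might be wired to `vision.onDeviceTextRecognizer()` and the cloud `textRecognizer` from the question.

```swift
import Foundation

// Hypothetical abstraction: a recognizer takes a completion handler and
// eventually calls it with the recognized text, or nil on failure.
typealias Recognizer = (@escaping (String?) -> Void) -> Void

/// Starts the fast on-device pass and the slower cloud pass together.
/// `update` receives the text plus a flag saying whether it is the
/// final (cloud) result, so the UI can show something early and refine it.
func recognizeHybrid(onDevice: Recognizer,
                     cloud: Recognizer,
                     update: @escaping (_ text: String, _ isFinal: Bool) -> Void) {
    var cloudFinished = false

    onDevice { text in
        // Drop a late on-device result if the cloud has already answered.
        if let text = text, !cloudFinished {
            update(text, false)
        }
    }

    cloud { text in
        cloudFinished = true
        // If the cloud call fails, the on-device text simply stays in place.
        if let text = text {
            update(text, true)
        }
    }
}

// Possible wiring with the objects from the question (not compiled here):
//
// let onDevice: Recognizer = { done in
//     self.vision.onDeviceTextRecognizer().process(visionImage) { result, _ in
//         done(result?.text)
//     }
// }
// let cloud: Recognizer = { done in
//     self.textRecognizer.process(visionImage) { result, _ in
//         done(result?.text)
//     }
// }
// recognizeHybrid(onDevice: onDevice, cloud: cloud) { text, _ in
//     self.resultView.text = text
// }
```

With this shape the user sees the on-device text almost immediately, and the network latency of the cloud call only delays the correction, not the first result.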