Tags: ios, swift, arkit, avcapturesession, text-recognition

How to display a text recognition bounding box on screen for an ARFrame's captured image? (iOS)


I've read Apple's official sample RealtimeNumberReader. It uses AVCaptureSession and the method layerRectConverted(fromMetadataOutputRect:), which belongs to AVCaptureVideoPreviewLayer and so is only available with AVCaptureSession, to convert a bounding box into screen coordinates:

let rect = layer.layerRectConverted(fromMetadataOutputRect: box.applying(self.visionToAVFTransform))

Now I want to recognize text in an ARFrame's capturedImage and then display the bounding box on screen. Is that possible?

I know how to recognize text in a single image from the official tutorial. My problem is how to convert the normalized bounding box coordinates to viewport coordinates.

Please help and thank you very much!!!


Solution

  • Try looking at this git repo. Having messed with it myself, I found it is not the most performant, but it should give you a start.
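
For the coordinate conversion itself, one possible approach is ARFrame's displayTransform(for:viewportSize:), which plays roughly the role that layerRectConverted plays for AVCaptureSession. Below is a minimal sketch, not a tested drop-in. It assumes the Vision request was run on the raw capturedImage with no orientation applied, so the boxes are relative to the unrotated buffer. The two viewRect helpers are names made up for illustration; only displayTransform(for:viewportSize:) and CGRect.applying(_:) are framework API.

```swift
import ARKit
import UIKit

/// Hypothetical helper: maps a Vision normalized box (origin bottom-left)
/// to view points, given a transform from normalized image coordinates
/// (origin top-left) to normalized view coordinates.
func viewRect(forVisionBox box: CGRect,
              displayTransform: CGAffineTransform,
              viewportSize: CGSize) -> CGRect {
    // Vision measures y from the bottom edge; flip it so y runs from the top.
    let topLeftBox = CGRect(x: box.minX,
                            y: 1 - box.maxY,
                            width: box.width,
                            height: box.height)
    // Scale normalized view coordinates (0...1) up to points.
    let toViewport = CGAffineTransform(scaleX: viewportSize.width,
                                       y: viewportSize.height)
    return topLeftBox.applying(displayTransform).applying(toViewport)
}

/// Hypothetical convenience wrapper that pulls the transform from an ARFrame.
func viewRect(forVisionBox box: CGRect,
              frame: ARFrame,
              orientation: UIInterfaceOrientation,
              viewportSize: CGSize) -> CGRect {
    // ARKit accounts for rotation and aspect-fill cropping of the camera image.
    let transform = frame.displayTransform(for: orientation,
                                           viewportSize: viewportSize)
    return viewRect(forVisionBox: box,
                    displayTransform: transform,
                    viewportSize: viewportSize)
}
```

You would call the second function with observation.boundingBox, the current ARFrame, the interface orientation, and the ARSCNView's bounds.size, then use the result as the frame of an overlay layer. If you pass an orientation to the Vision request handler so that text is upright in portrait, the boxes are relative to the rotated image instead, and the flip step would need adjusting.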