I am trying to draw boxes around the text recognized by Firebase ML Kit on top of the image. So far I have had no success: I can't see any boxes at all because they are all drawn offscreen. I used this article as a reference: https://medium.com/swlh/how-to-draw-bounding-boxes-with-swiftui-d93d1414eb00 and also this project: https://github.com/firebase/quickstart-ios/blob/master/mlvision/MLVisionExample/ViewController.swift
This is the view where the boxes should be shown:
    struct ImageScanned: View {
        var image: UIImage
        @Binding var rectangles: [CGRect]
        @State var viewSize: CGSize = .zero

        var body: some View {
            // TODO: fix scaling
            ZStack {
                Image(uiImage: image)
                    .resizable()
                    .scaledToFit()
                    .overlay(
                        GeometryReader { geometry in
                            ZStack {
                                ForEach(self.transformRectangles(geometry: geometry)) { rect in
                                    Rectangle()
                                        .path(in: CGRect(
                                            x: rect.x,
                                            y: rect.y,
                                            width: rect.width,
                                            height: rect.height))
                                        .stroke(Color.red, lineWidth: 2.0)
                                }
                            }
                        }
                    )
            }
        }

        private func transformRectangles(geometry: GeometryProxy) -> [DetectedRectangle] {
            var rectangles: [DetectedRectangle] = []
            let imageViewWidth = geometry.frame(in: .global).size.width
            let imageViewHeight = geometry.frame(in: .global).size.height
            let imageWidth = image.size.width
            let imageHeight = image.size.height
            let imageViewAspectRatio = imageViewWidth / imageViewHeight
            let imageAspectRatio = imageWidth / imageHeight
            let scale = (imageViewAspectRatio > imageAspectRatio)
                ? imageViewHeight / imageHeight : imageViewWidth / imageWidth
            let scaledImageWidth = imageWidth * scale
            let scaledImageHeight = imageHeight * scale
            let xValue = (imageViewWidth - scaledImageWidth) / CGFloat(2.0)
            let yValue = (imageViewHeight - scaledImageHeight) / CGFloat(2.0)
            var transform = CGAffineTransform.identity.translatedBy(x: xValue, y: yValue)
            transform = transform.scaledBy(x: scale, y: scale)
            for rect in self.rectangles {
                let rectangle = rect.applying(transform)
                rectangles.append(DetectedRectangle(width: rectangle.width, height: rectangle.height, x: rectangle.minX, y: rectangle.minY))
            }
            return rectangles
        }
    }
    struct DetectedRectangle: Identifiable {
        var id = UUID()
        var width: CGFloat = 0
        var height: CGFloat = 0
        var x: CGFloat = 0
        var y: CGFloat = 0
    }
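To check the scaling arithmetic of transformRectangles separately from SwiftUI, it can be reduced to a plain value type. The following is only a minimal sketch: `AspectFitMapping` and the sizes used are hypothetical and not part of the project above. It assumes the detected rectangles are expressed in the same units as the image size passed in.

```swift
import Foundation

/// Scale factor and offsets that aspect-fit `imageSize` inside `viewSize`.
/// (Hypothetical helper, introduced only for illustration.)
struct AspectFitMapping {
    let scale: CGFloat
    let offsetX: CGFloat
    let offsetY: CGFloat

    init(imageSize: CGSize, viewSize: CGSize) {
        // The smaller ratio keeps the whole image visible; this is equivalent
        // to the aspect-ratio comparison in transformRectangles.
        let s = min(viewSize.width / imageSize.width,
                    viewSize.height / imageSize.height)
        scale = s
        // Center the scaled image inside the view.
        offsetX = (viewSize.width - imageSize.width * s) / 2
        offsetY = (viewSize.height - imageSize.height * s) / 2
    }

    /// Maps a rectangle from image coordinates into view coordinates.
    func map(_ rect: CGRect) -> CGRect {
        return CGRect(x: rect.minX * scale + offsetX,
                      y: rect.minY * scale + offsetY,
                      width: rect.width * scale,
                      height: rect.height * scale)
    }
}

// Hypothetical numbers: an 800 x 600 image shown in a 400 x 400 view.
let mapping = AspectFitMapping(imageSize: CGSize(width: 800, height: 600),
                               viewSize: CGSize(width: 400, height: 400))
let mapped = mapping.map(CGRect(x: 200, y: 100, width: 400, height: 20))
print(mapping.scale, mapped)  // scale 0.5; rect at (100, 100) sized 200 x 10
```

If the mapped values are correct for hand-picked inputs like these but the boxes are still misplaced on screen, the input rectangles are probably not in the coordinate space that `imageSize` describes.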
This is the view where this view is nested in:
    struct StartScanView: View {
        @State var showCaptureImageView: Bool = false
        @State var image: UIImage? = nil
        @State var rectangles: [CGRect] = []

        var body: some View {
            ZStack {
                if showCaptureImageView {
                    CaptureImageView(isShown: $showCaptureImageView, image: $image)
                } else {
                    VStack {
                        Button(action: {
                            self.showCaptureImageView.toggle()
                        }) {
                            Text("Start Scanning")
                        }
                        // show here View with rectangles on top of image
                        if self.image != nil {
                            ImageScanned(image: self.image ?? UIImage(), rectangles: $rectangles)
                        }
                        Button(action: {
                            self.processImage()
                        }) {
                            Text("Process Image")
                        }
                    }
                }
            }
        }

        func processImage() {
            // Nothing to process until an image has been captured.
            guard let image = image else { return }
            let scaledImageProcessor = ScaledElementProcessor()
            scaledImageProcessor.process(in: image) { text in
                for block in text.blocks {
                    for line in block.lines {
                        for element in line.elements {
                            self.rectangles.append(element.frame)
                        }
                    }
                }
            }
        }
    }
The tutorial's calculation produced rectangles that were too big, and the sample project's calculation produced rectangles that were too small (the same applies to the height). Unfortunately, I can't find out which size Firebase uses as the basis for an element's frame. This is what it looks like: Without scaling the width and height at all, the rectangles are roughly (though not exactly) the size they are supposed to be, which leads me to assume that ML Kit's frames are not expressed in proportion to image.size.width and image.size.height.
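One way to narrow down which size the frames refer to is to test them against a few candidate sizes. The sketch below is a diagnostic only. It assumes, without confirmation from the ML Kit documentation, that the frames may be reported in pixels, whereas UIImage's size is in points (pixel dimensions are size multiplied by scale). `plausibleSpaces` and all values are hypothetical.

```swift
import Foundation

/// Returns the names of the candidate coordinate spaces whose bounds
/// contain every detected frame. (Hypothetical diagnostic helper.)
func plausibleSpaces(for frames: [CGRect],
                     candidates: [(name: String, size: CGSize)]) -> [String] {
    guard let first = frames.first else { return [] }
    // Smallest rectangle that encloses all detected frames.
    let extent = frames.dropFirst().reduce(first) { $0.union($1) }
    return candidates
        .filter { CGRect(origin: .zero, size: $0.size).contains(extent) }
        .map { $0.name }
}

// Hypothetical values: a photo reported as 1000 x 750 points at scale 3.
let pointSize = CGSize(width: 1000, height: 750)
let pixelSize = CGSize(width: 3000, height: 2250)
let frames = [CGRect(x: 120, y: 80, width: 900, height: 60),
              CGRect(x: 1500, y: 1900, width: 1200, height: 90)]
let spaces = plausibleSpaces(for: frames,
                             candidates: [("points", pointSize), ("pixels", pixelSize)])
print(spaces)  // ["pixels"]
```

In the app, the candidates could be image.size, image.size multiplied by image.scale, and the same two sizes with width and height swapped to account for the camera orientation. A candidate that fails this check cannot be the right basis for the scale factor.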
This is how I changed the ForEach loop:
    Image(uiImage: uiimage!).resizable().scaledToFit().overlay(
        GeometryReader { (geometry: GeometryProxy) in
            ForEach(self.blocks, id: \.self) { (block: VisionTextBlock) in
                Rectangle()
                    .path(in: block.frame.applying(self.transformMatrix(geometry: geometry, image: self.uiimage!)))
                    .stroke(Color.purple, lineWidth: 2.0)
            }
        }
    )
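The overlay above depends on a transform that is built by translating first and scaling second, and the order of those two calls matters. As a small sketch with made-up numbers (not taken from the project), this shows what each order does to a rectangle:

```swift
import Foundation

// Hypothetical values: scale by 0.5 and shift down by 50 points.
let scale: CGFloat = 0.5
let source = CGRect(x: 200, y: 100, width: 400, height: 20)

// translatedBy, then scaledBy: the point is scaled first and the
// offset is added afterwards, unscaled (y = 100 * 0.5 + 50).
let translateThenScale = CGAffineTransform.identity
    .translatedBy(x: 0, y: 50)
    .scaledBy(x: scale, y: scale)

// scaledBy, then translatedBy: the offset is scaled as well
// (y = (100 + 50) * 0.5).
let scaleThenTranslate = CGAffineTransform.identity
    .scaledBy(x: scale, y: scale)
    .translatedBy(x: 0, y: 50)

let first = source.applying(translateThenScale)
let second = source.applying(scaleThenTranslate)
print(first.minY, second.minY)  // 100.0 75.0
```

Because the centering offsets are computed in view coordinates, the translate-then-scale order is the one that leaves them unscaled.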
Instead of passing x, y, width and height individually, I apply the transform returned by the transformMatrix function to the block's frame and pass the resulting rectangle to the path function.
My transformMatrix function is
        private func transformMatrix(geometry: GeometryProxy, image: UIImage) -> CGAffineTransform {
            let imageViewWidth = geometry.size.width
            let imageViewHeight = geometry.size.height
            let imageWidth = image.size.width
            let imageHeight = image.size.height
            let imageViewAspectRatio = imageViewWidth / imageViewHeight
            let imageAspectRatio = imageWidth / imageHeight
            let scale = (imageViewAspectRatio > imageAspectRatio) ?
                imageViewHeight / imageHeight :
                imageViewWidth / imageWidth
            // The image is displayed with `scaledToFit()`, which scales it to fit the view while
            // maintaining the aspect ratio. Multiply by `scale` to get the image's displayed size.
            let scaledImageWidth = imageWidth * scale
            let scaledImageHeight = imageHeight * scale
            // Offsets that center the scaled image inside the view.
            let xValue = (imageViewWidth - scaledImageWidth) / CGFloat(2.0)
            let yValue = (imageViewHeight - scaledImageHeight) / CGFloat(2.0)
            var transform = CGAffineTransform.identity.translatedBy(x: xValue, y: yValue)
            transform = transform.scaledBy(x: scale, y: scale)
            return transform
        }
    }
and the output is