Tags: azure, location, highlight, azure-cognitive-search

Azure Search - Highlights - Locating in image


Just looking for guidance, or even a general outline of an approach, here.

I am using Azure Search to OCR a batch of PDFs. I have turned on hit highlighting and am successfully getting results back, which I loop through and display in my view for the end user. I would like to expand that functionality to show the PDF images with the highlighting drawn on the images themselves, like in the JFK Files Azure example. I am not proficient in React and seem to be getting lost there.

I am assuming I need to save off the OCR images to a data store for reference, using the normalized_images that are created? I do have the PDFs locally and can load them, but I assume the OCR images may be different. I have turned on GeneratedNormalizedImagesPerPage and turned on the cache, which creates files in my storage account.
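For reference, this is roughly what my indexer definition looks like (a trimmed sketch; the service, index, skillset and data source names and the connection string are placeholders):

```typescript
// Minimal sketch of the indexer definition PUT to the Azure Search REST API.
// All names and keys below are placeholders.
const indexer = {
  name: "pdf-indexer",
  dataSourceName: "pdf-datasource",
  targetIndexName: "pdf-index",
  skillsetName: "pdf-skillset",
  parameters: {
    configuration: {
      dataToExtract: "contentAndMetadata",
      // Emits one normalized image per PDF page into /document/normalized_images/*
      imageAction: "generateNormalizedImagesPerPage",
    },
  },
  // Incremental enrichment cache; this is what creates the files in my storage account.
  cache: {
    storageConnectionString: "<storage-connection-string>",
    enableReprocessing: true,
  },
};

async function createOrUpdateIndexer(): Promise<void> {
  const res = await fetch(
    "https://<service-name>.search.windows.net/indexers/pdf-indexer?api-version=2021-04-30-Preview",
    {
      method: "PUT",
      headers: {
        "Content-Type": "application/json",
        "api-key": "<admin-api-key>",
      },
      body: JSON.stringify(indexer),
    }
  );
  if (!res.ok) throw new Error(`Indexer update failed: ${res.status}`);
}
```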

Then I assume I need to pull the associated image, display it, and use the highlight results to pull a corresponding bounding box where the phrase was detected? The problem with that approach is that I do not see any association between a highlight hit and the location (bounding box) of the hit, nor the image file the hit was on.

I am probably way off on the approach here, but any guidance is appreciated.

Edit 1: I did notice the items on this page in the JFK example: https://github.com/microsoft/AzureSearch_JFK_Files/tree/master/JfkWebApiSkills/JfkWebApiSkills. Would replicating the ImageStore (so the images are stored in my storage account) and the HocrGenerator (which appears to handle points in a doc) in the skillset for my index be the right approach?


Solution

  • There are a few steps here:

    1. You need to save the layoutText from the OCR skill somewhere the UI can access it. The JFK Files demo converts it to HOCR (to display in the UI) and saves it as a field in the index so that it is retrieved with the search results. HOCR isn't necessary, and you may find it more efficient to store the layout in blobs using a knowledge store object projection (see the skillset sketch after this list).

    2. Save the extracted images into blob storage using a file projection into the knowledge store (also covered in the sketch below). Keep in mind that the images may be resized in the process, and the coordinates will match the resized image saved to the store. If you want to map the coordinates back to the original image, see this.

    3. At search time, map the highlight to the metadata. You will find this code in the Node.js frontend; however, it may be simpler to follow in the original demo by following the code here. Essentially you find the first occurrence of the highlighted word in the metadata, display the associated image, and calculate the bounding region of the word (see the mapping sketch below).
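For steps 1 and 2, the knowledge store definition lives on the skillset. A rough sketch of what that could look like follows; the container names, connection string, and shaped paths are placeholders, and the exact projection shapes should be checked against the knowledge store docs:

```typescript
// Sketch of a skillset whose knowledge store projects the OCR layout (object
// projection) and the normalized page images (file projection) into blob storage.
// All names, paths, and keys are placeholders.
const skillset = {
  name: "pdf-skillset",
  skills: [
    {
      "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
      context: "/document/normalized_images/*",
      defaultLanguageCode: "en",
      inputs: [{ name: "image", source: "/document/normalized_images/*" }],
      outputs: [
        { name: "text", targetName: "text" },
        // layoutText carries the per-word bounding boxes used later for highlighting.
        { name: "layoutText", targetName: "layoutText" },
      ],
    },
    {
      // Shaper skill wraps the layout so it can be projected as its own object.
      "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
      context: "/document/normalized_images/*",
      inputs: [
        { name: "layoutText", source: "/document/normalized_images/*/layoutText" },
      ],
      outputs: [{ name: "output", targetName: "ocrLayout" }],
    },
  ],
  knowledgeStore: {
    storageConnectionString: "<storage-connection-string>",
    projections: [
      {
        // Step 1: layout metadata as JSON blobs the UI can fetch at search time.
        objects: [
          {
            storageContainer: "ocr-layouts",
            source: "/document/normalized_images/*/ocrLayout",
          },
        ],
        // Step 2: the (possibly resized) page images themselves.
        files: [
          {
            storageContainer: "page-images",
            source: "/document/normalized_images/*",
          },
        ],
        tables: [],
      },
    ],
  },
};
```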
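For step 3, the search-time mapping is mostly plain string matching over that stored layout. A simplified sketch, assuming the projected layout blob keeps the OCR skill's layoutText shape (words with a text value and a four-point boundingBox in image coordinates); it finds the first occurrence of the highlighted phrase and returns a rectangle to draw over the page image:

```typescript
// Shapes assumed to match the OCR skill's layoutText output.
interface OcrPoint { x: number; y: number; }
interface OcrWord { text: string; boundingBox: OcrPoint[]; }
interface OcrLayout { text: string; words: OcrWord[]; }

interface HighlightRegion { left: number; top: number; width: number; height: number; }

// Find the first occurrence of the highlighted phrase in the layout and compute
// a rectangle covering its words, suitable for drawing over the page image.
function findHighlightRegion(layout: OcrLayout, phrase: string): HighlightRegion | null {
  const target = phrase.toLowerCase().split(/\s+/).filter(Boolean);
  if (target.length === 0) return null;

  // Ignore punctuation and case when comparing OCR words to the highlight terms.
  const normalize = (s: string) => s.toLowerCase().replace(/[^\p{L}\p{N}]/gu, "");

  for (let i = 0; i + target.length <= layout.words.length; i++) {
    const candidate = layout.words.slice(i, i + target.length);
    const matches = candidate.every((w, j) => normalize(w.text) === normalize(target[j]));
    if (!matches) continue;

    // Union of all corner points of the matched words.
    const points = candidate.flatMap(w => w.boundingBox);
    const xs = points.map(p => p.x);
    const ys = points.map(p => p.y);
    const left = Math.min(...xs);
    const top = Math.min(...ys);
    return { left, top, width: Math.max(...xs) - left, height: Math.max(...ys) - top };
  }
  return null; // phrase not found on this page
}
```

If the image you display is a different size from the projected normalized image, scale the returned rectangle by the ratio of the displayed size to the stored image size before drawing it.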