cloud-document-ai

Empty Text Fields Not Being Labeled in Document AI Custom Processor HITL UI


When reviewing auto-labeled documents in the HITL or training labeling interface, it's not clear to me from Google's instructions whether I should label empty fields on a form. For example, if I have these fields:

Example empty text fields

Should I leave them unlabeled since they don't have values, or draw the bounding boxes that identify where a value would be and leave the text value blank like this?

Example empty text fields labeled

I'm reviewing auto-labeled forms and seeing these blank fields unlabeled frequently, and I'm not sure if I'm helping to improve the model by drawing the bounding boxes.


Solution

  • Confirmed with the Product Managers: leave empty fields unlabeled when creating training data for a Custom Processor (Uptraining or Custom Document Extractor).
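
If blank fields have already been boxed in a labeled export, one way to apply the advice above is to drop those labels before training. The following is only a minimal sketch: it assumes the labeled document is available as a plain dict shaped like the Document JSON format, with a top-level `entities` list whose items carry a `mentionText` string. The helper name `drop_empty_entities` and the sample labels `name` and `phone` are hypothetical, and nested entity properties are not handled.

```python
def drop_empty_entities(document: dict) -> dict:
    """Return a shallow copy of a Document-style dict without blank entities.

    Assumption: entities live under "entities" and their text under
    "mentionText", as in the Document JSON format.
    """
    kept = [
        entity
        for entity in document.get("entities", [])
        # Treat a missing, empty, or whitespace-only value as an empty field.
        if (entity.get("mentionText") or "").strip()
    ]
    cleaned = dict(document)  # the input dict is left unmodified
    cleaned["entities"] = kept
    return cleaned


# Hypothetical labeled form: "phone" was boxed but has no value.
example = {
    "text": "Name: Jane Doe\nPhone:\n",
    "entities": [
        {"type": "name", "mentionText": "Jane Doe"},
        {"type": "phone", "mentionText": ""},
    ],
}

cleaned = drop_empty_entities(example)
print([entity["type"] for entity in cleaned["entities"]])
```

Only the `name` label survives here, which matches leaving the empty `phone` field unlabeled in the UI.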