android, parsing, google-cloud-vision

How to parse name, phone number, and email from a name card after using Google Cloud Vision OCR on Android?


I finally managed to get the name card's content using the Google Cloud Vision API (OCR). I stored all of the content in a TextView. How can I extract the name, phone number, and email from it? Any ideas for pulling the most important details out of the String? Thanks in advance.


Solution

  • I understand that you want to extract and identify certain data from a card, using Google Cloud Vision API.

    You've been able to obtain the text via OCR, so the real problem is identifying which part of it is which, since cards come in an effectively unlimited number of styles and layouts.

    As @Inga mentioned in the comments, you can try regular expressions, although this gets harder the more styles and layouts you want to support.
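
    As a rough illustration of the regular-expression approach, here is a minimal sketch in plain Java. The class name `CardFieldExtractor`, its method names, and all three patterns are assumptions made for this example; they are deliberately loose and will need tuning for real cards. The name guess in particular is only a heuristic (first line that looks like two or three capitalized words), which is exactly where regex starts to struggle.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Optional;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Hypothetical helper, not part of any Google library.
    final class CardFieldExtractor {
        // Loose email pattern; not a full RFC 5322 validator.
        private static final Pattern EMAIL =
                Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

        // Optional '+' and '(', then at least 7 digits/separators on one line,
        // starting and ending with a digit. Assumed format; adjust per locale.
        private static final Pattern PHONE =
                Pattern.compile("\\+?\\(?\\d[\\d ().-]{5,}\\d");

        // Heuristic: a line made of two or three capitalized words.
        private static final Pattern NAME_LINE =
                Pattern.compile("[A-Z][a-z]+(?: [A-Z][a-z]+){1,2}");

        // Collects every non-overlapping match of the pattern in the text.
        static List<String> findAll(Pattern pattern, String text) {
            List<String> matches = new ArrayList<>();
            Matcher m = pattern.matcher(text);
            while (m.find()) {
                matches.add(m.group());
            }
            return matches;
        }

        static List<String> emails(String ocrText) {
            return findAll(EMAIL, ocrText);
        }

        static List<String> phones(String ocrText) {
            return findAll(PHONE, ocrText);
        }

        // Returns the first line that matches NAME_LINE in full, if any.
        static Optional<String> guessName(String ocrText) {
            for (String line : ocrText.split("\\R")) {
                String trimmed = line.trim();
                if (NAME_LINE.matcher(trimmed).matches()) {
                    return Optional.of(trimmed);
                }
            }
            return Optional.empty();
        }
    }
    ```

    On Android you could pass in the OCR result directly, for example `CardFieldExtractor.emails(textView.getText().toString())`. Note that a job title such as "Senior Engineer" also matches the name pattern, so the guess is only right when the name happens to come first.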

    So I would also suggest considering a Machine Learning approach.

    For example, take a look at this article about Parsing Structured Documents with Custom Entity Extraction. It uses the Google Cloud Vision API to read the data, just as you do, but then uses the Google Cloud Natural Language API to identify certain elements via Entity Extraction.

    Take a look at the Natural Language Entity description to see what kinds of elements you can identify with this feature, for example names, phone numbers, and addresses.
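
    To make the entity idea concrete, here is a small sketch of what you might do with the result once you have it. It assumes you have already called the Natural Language API's entity analysis and copied each returned entity's `name` and `type` into a simple object; the `EntityGrouper` class and its nested `Entity` type are hypothetical stand-ins written for this example, not the client library's classes. The type strings `PERSON`, `PHONE_NUMBER`, and `ADDRESS` follow the API's entity types as I understand them. As far as I can tell there is no dedicated email type, so a regex is probably still the simplest way to get the email.

    ```java
    import java.util.ArrayList;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical helper, not part of the Natural Language client library.
    final class EntityGrouper {
        // Minimal stand-in for one entity from the analysis response.
        static final class Entity {
            final String name;
            final String type;

            Entity(String name, String type) {
                this.name = name;
                this.type = type;
            }
        }

        // Keeps only the wanted types and groups entity names under each type,
        // preserving the order in which the types were requested.
        static Map<String, List<String>> groupByType(
                List<Entity> entities, List<String> wantedTypes) {
            Map<String, List<String>> grouped = new LinkedHashMap<>();
            for (String type : wantedTypes) {
                grouped.put(type, new ArrayList<>());
            }
            for (Entity entity : entities) {
                List<String> bucket = grouped.get(entity.type);
                if (bucket != null) {
                    bucket.add(entity.name);
                }
            }
            return grouped;
        }
    }
    ```

    Calling `groupByType(entities, List.of("PERSON", "PHONE_NUMBER", "ADDRESS"))` gives you one list per field, and an empty list tells you the API did not find that kind of entity on the card.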

    Similarly, if this feature doesn't cover all the data that you need to identify, you can also consider creating and training a custom AutoML Natural Language model for the specific type of data that you want to extract. The article mentioned earlier also uses this to identify specific data from restaurant menus.

    You might also consider taking a look at Google Cloud Document AI, which offers OCR features oriented toward document analysis.