laravel-8 ocr google-cloud-vision

Extract text from documents like PAN and Aadhaar


I am using the Google Cloud Vision API to extract text from Aadhaar and PAN cards. How can I get the exact user details, such as name, father's name, and address?

Raw Data

ଭାରତ ସରକାର
Government of India
ଜିତ୍ୟାନନ୍ଦ ଖେମୁକୁ
NITYANANDA KHEMUDU
ପିତା : ସୀତାରାମ ଖେମୁକୁ
Father: Sitaram Khemudu
ଜନ୍ମ ତାରିଖ / DOB : 01.07.1999
ପୁରୁଷ / Male
ମୋ ଆଧାର, ମୋ ପରିଚୟ


Solution

  • I have built 5-6 OCR pipelines to date (Aadhaar, PAN, ITR, driving licence, etc.) using the Google Cloud Vision API. I think you are looking for a response like

    {"pan_card_no":"ECXXXXXX123", "name":"fshksj" }

    To get such a response you need to build your own logic. Here is some of the logic I can share with you:

    1. Perform OCR on your document using the Google Cloud Vision API and store the response in a list (Google returns the text line by line).
    2. In the case above, if you want to grab the DOB first, you can build logic like: if "DOB" is in a list item, grab the numeric values from that item.
    3. To get the name, drop the unnecessary items from the list with conditions such as (if "India" in i) or (if i.isdigit()). Keep dropping unnecessary items from the main list this way until only the name is left.
    4. To grab the address: about 95% of the time the address ends with the PIN code, so treat the PIN code as the last index of the address. Look for an "Address"-style keyword, then join all the elements from the keyword's index to the PIN code's index (this is easy to do with a list). To check whether the PIN code is valid, you can use a library such as Pyzipin.
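
Steps 2 and 3 can be sketched roughly as follows, using the raw data from the question as input. This is only an illustration: the function name `extract_fields`, the `NOISE_WORDS` list, and the rule "the first remaining Latin-script line is the name" are assumptions made for this example, and real cards will need more conditions.

```python
import re

# Date formats like 01.07.1999, 01/07/1999 or 01-07-1999.
DATE_RE = re.compile(r"\b\d{2}[./-]\d{2}[./-]\d{4}\b")

# Words that mark a line as boilerplate rather than a name (assumed list).
NOISE_WORDS = {"government", "india", "father", "dob", "male", "female"}


def extract_fields(lines):
    """Pick name, father's name and DOB out of OCR lines (rule-based sketch)."""
    fields = {"name": None, "father_name": None, "dob": None}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        lower = line.lower()
        # Step 2: a line mentioning "DOB" carries the date.
        if "dob" in lower:
            match = DATE_RE.search(line)
            if match:
                fields["dob"] = match.group(0)
            continue
        # The English "Father: ..." line carries the father's name.
        if lower.startswith("father"):
            fields["father_name"] = line.split(":", 1)[-1].strip()
            continue
        # Step 3: drop everything that cannot be the name.
        if not line.isascii():                  # regional-script duplicates
            continue
        if any(ch.isdigit() for ch in line):    # numbers, IDs, dates
            continue
        if NOISE_WORDS & set(re.findall(r"[a-z]+", lower)):
            continue
        if fields["name"] is None:              # first surviving line
            fields["name"] = line
    return fields


# OCR lines from the question's raw data.
ocr_lines = [
    "ଭାରତ ସରକାର",
    "Government of India",
    "ଜିତ୍ୟାନନ୍ଦ ଖେମୁକୁ",
    "NITYANANDA KHEMUDU",
    "ପିତା : ସୀତାରାମ ଖେମୁକୁ",
    "Father: Sitaram Khemudu",
    "ଜନ୍ମ ତାରିଖ / DOB : 01.07.1999",
    "ପୁରୁଷ / Male",
    "ମୋ ଆଧାର, ମୋ ପରିଚୟ",
]

result = extract_fields(ocr_lines)
print(result)
```

The filters are deliberately conservative: anything not positively recognised as a labelled field is discarded unless it looks like a plain Latin-script name.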

    There are many more conditions you can use; the ones above are only the most basic. If you need any specific logic, you can ask me.
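
The address idea from step 4 might look something like the sketch below. The sample lines are made up for illustration (the question's raw data contains no address), `extract_address` is a hypothetical helper name, and the regular expression only checks that the PIN code has the six-digit format with a non-zero first digit. It does not confirm that the PIN code actually exists, which is where a lookup library would come in.

```python
import re

# Indian PIN codes are six digits and do not start with 0 (format check only).
PIN_RE = re.compile(r"\b[1-9]\d{5}\b")


def extract_address(lines):
    """Join the lines from the 'Address' keyword up to the line with the PIN code."""
    # Index of the first line that mentions an address keyword.
    start = next(
        (i for i, line in enumerate(lines) if "address" in line.lower()), None
    )
    if start is None:
        return None
    # Index of the first line at or after the keyword that contains a PIN code.
    end = next(
        (i for i in range(start, len(lines)) if PIN_RE.search(lines[i])), None
    )
    if end is None:
        return None
    parts = [line.strip() for line in lines[start:end + 1]]
    # Remove the "Address:" label itself from the first part.
    parts[0] = re.sub(r"^.*?address\s*:?\s*", "", parts[0], flags=re.IGNORECASE)
    return {
        "address": " ".join(p for p in parts if p),
        "pincode": PIN_RE.search(lines[end]).group(0),
    }


# Made-up OCR lines, for illustration only.
sample_lines = [
    "Address:",
    "House 12, Example Street,",
    "Example Town, Example District,",
    "Example State - 123456",
    "1234 5678 9012",
]

address_result = extract_address(sample_lines)
print(address_result)
```

The trailing line with four-digit groups is ignored because the search stops at the first line containing a six-digit PIN code.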