Google Document AI with gs:// URI results in 3 INVALID_ARGUMENT: Document bytes or path is required


I am getting a "3 INVALID_ARGUMENT: Document bytes or path is required" error when calling Google Cloud Document AI with a gs:// file URI.

Minimal implementation on Node.js 12:

const documentai = require('@google-cloud/documentai').v1;

// client_email, private_key and project_id are assumed to be defined elsewhere.
function processingDocument(params) {
    return new Promise((resolve, reject) => {
        const options = {
            credentials: {
                client_email: client_email,
                private_key: private_key,
            },
            projectId: project_id
        };
        const client = new documentai.DocumentProcessorServiceClient(options);
        client.processDocument(params, function(error, data) {
            if (error) {
                // Return here so resolve() is not also called after a rejection.
                return reject(error);
            }
            return resolve(true); // Testing only
        });
    });
}

Params that work:

const params = {
   "name":"projects/project_name/locations/us/processors/xxxx",
   "rawDocument":{
      "mimeType":"image/png",
      "content":"iV....=" //b64 content
   }
}
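
For reference, a minimal sketch of how a working rawDocument payload like the one above could be built from raw bytes. The helper name buildRawDocumentRequest is hypothetical and not part of the Document AI client; it only base64-encodes the bytes and fills in the same fields as the working params.

```javascript
// Hypothetical helper: builds a processDocument request from raw file bytes.
// `bytes` can be a Buffer (for example, the result of fs.readFileSync).
function buildRawDocumentRequest(processorName, bytes, mimeType) {
  return {
    name: processorName,
    rawDocument: {
      mimeType: mimeType,
      // The working params above pass the file as base64-encoded content.
      content: Buffer.from(bytes).toString('base64'),
    },
  };
}

// Usage sketch (the local path is an assumption):
// const fs = require('fs');
// const params = buildRawDocumentRequest(
//   'projects/project_name/locations/us/processors/xxxx',
//   fs.readFileSync('./file.png'),
//   'image/png'
// );
```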

Params that do not work:

const params = {
   "name":"projects/project_name/locations/us/processors/xxxx",
   "inlineDocument":{
      "mimeType":"image/png",
      "uri":"gs://bucket_name/demo-assets/file.png"
   }
}

I suspected a permission error, so I checked whether Document AI requires explicit permission to access Google Cloud Storage; apparently it does not.

I also tried a more elaborate payload:

const params = {
   "name":"projects/project_name/locations/us/processors/xxxx",
   "inlineDocument":{
      "mimeType":"image/png",
      "textStyles":[],
      "pages":[],
      "entities":[],
      "entityRelations":[],
      "textChanges":[],
      "revisions":[],
      "uri":"gs://bucket_name/demo-assets/file.png"
   }
}

Unfortunately, I am stuck. Any idea what is happening?


Solution

  • The uri field cannot currently be used for processing a document.

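    You are currently using Online Processing (processDocument), which only supports local files: the document content has to be sent in the request itself, as in your working rawDocument example.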

    If you want to process documents stored in Google Cloud Storage, you will need to use Batch Processing following the examples provided on this page.

    https://cloud.google.com/document-ai/docs/send-request#batch-process

    The GCS Input URI must be provided in the BatchDocumentsInputConfig gcsPrefix or gcsDocuments field.
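
    As a rough illustration of the batch approach described above, here is a minimal sketch for a single file using gcsDocuments. The helper names buildBatchRequest and runBatch and the output location gs://bucket_name/demo-output/ are assumptions made for this example; the client is passed in so the credentials setup from the question can be reused.

```javascript
// Hypothetical helper: builds a batchProcessDocuments request for one file in Cloud Storage.
function buildBatchRequest(processorName, gcsInputUri, mimeType, gcsOutputUri) {
  return {
    name: processorName,
    inputDocuments: {
      // The input URI goes in gcsDocuments (or gcsPrefix for a whole folder).
      gcsDocuments: {
        documents: [{ gcsUri: gcsInputUri, mimeType: mimeType }],
      },
    },
    documentOutputConfig: {
      // Batch results are written to Cloud Storage, not returned inline.
      gcsOutputConfig: { gcsUri: gcsOutputUri },
    },
  };
}

// Hypothetical helper: starts the long-running operation and waits for it to finish.
// `client` is expected to be a documentai.DocumentProcessorServiceClient instance.
async function runBatch(client, request) {
  const [operation] = await client.batchProcessDocuments(request);
  await operation.promise();
}

// Usage sketch (the output location is an assumption):
// const documentai = require('@google-cloud/documentai').v1;
// const client = new documentai.DocumentProcessorServiceClient(options);
// const request = buildBatchRequest(
//   'projects/project_name/locations/us/processors/xxxx',
//   'gs://bucket_name/demo-assets/file.png',
//   'image/png',
//   'gs://bucket_name/demo-output/'
// );
// runBatch(client, request).catch(console.error);
```

    Once the operation completes, the processed output should be available as JSON files under the output location rather than in the API response.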