The following call takes several seconds. I know this is not a conceptual question, but I'm cross-posting from a GCP issue in case someone has hit the same scenario.
const { PredictionServiceClient } = require('@google-cloud/automl');

// apiEndpoint is the regional endpoint the model was deployed to,
// e.g. 'eu-automl.googleapis.com'.
const predictionServiceClient = new PredictionServiceClient({ apiEndpoint });

// predict() resolves to an array; the response is its first element.
const [prediction] = await predictionServiceClient.predict({
  name: modelFullId, // 'projects/{project}/locations/{location}/models/{model}'
  params,
  payload,
});
In practice this API takes close to 10s when cold and 5s when hot. Is there a way to speed this up other than exporting the model and running it ourselves?
Yes, you can export the model and use it with TensorFlow.js:
https://cloud.google.com/vision/automl/docs/edge-quickstart
https://github.com/tensorflow/tfjs/tree/master/tfjs-automl
Export the model and download the model.json, dict.txt, and *.bin files to your local machine, then load the model with TensorFlow.js and run predictions against it.
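The loading step above can be sketched roughly as follows, assuming the exported files (model.json, dict.txt, *.bin) are served from a ./model/ directory; the path and the surrounding function name are illustrative, not from the original post.

```javascript
// Sketch: run the exported AutoML Edge image model locally via tfjs-automl.
// classifyLocally() takes an <img>/<canvas> element (or a tf.Tensor) and
// returns per-label scores, avoiding the remote predict() round trip.
async function classifyLocally(imgElement) {
  const automl = require('@tensorflow/tfjs-automl');
  // loadImageClassification fetches model.json plus the referenced
  // dict.txt and *.bin weight shards from the same directory.
  const model = await automl.loadImageClassification('model/model.json');
  // classify() resolves to [{ label, prob }, ...] for the labels in dict.txt.
  return model.classify(imgElement);
}

module.exports = { classifyLocally };
```

After the first load, inference runs in-process, so there is no cold-start penalty on subsequent calls; keeping the loaded model in a long-lived variable is what actually recovers the latency.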