I'm aiming to use Imagen in QnA (visual question answering) mode from a non-interactive back-end.
The documentation (https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/imagetext-vqa?project=gdg-demos&cloudshell=true) fills in the Bearer token using the gcloud auth print-access-token
command. If I execute that in Cloud Shell I get a token, but that won't be usable from a non-interactive back-end. Here's my code:
import requests

ENCODING = "utf-8"
# base64_bytes holds the base64-encoded image bytes (prepared earlier, not shown)
base64_string = base64_bytes.decode(ENCODING)
VQA_PROMPT = "Describe the content of the image in great detail"
payload = {
    "instances": [
        {
            "prompt": VQA_PROMPT,
            "image": {
                "bytesBase64Encoded": base64_string
            }
        }
    ],
    "parameters": parameters  # defined earlier, e.g. {"sampleCount": 3}
}
url = "https://us-central1-aiplatform.googleapis.com/v1/projects/gdg-demos/locations/us-central1/publishers/google/models/imagetext:predict"
headers = {
    "Authorization": "Bearer {}".format(bearer_token),
    "Accept": "application/json; charset=utf-8",
}
response = requests.post(url, headers=headers, json=payload)
I'm getting a 401 HTTP status code response:
b'{
"error": {
"code": 401,
"message": "Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.",
"status": "UNAUTHENTICATED",
"details": [
{
"@type": "type.googleapis.com/google.rpc.ErrorInfo",
"reason": "ACCESS_TOKEN_TYPE_UNSUPPORTED",
"metadata": {
"service": "aiplatform.googleapis.com",
"method": "google.cloud.aiplatform.v1.PredictionService.Predict"
}
}
]
}
}'
I also tried the following three steps:
gcloud auth login --brief --quiet
REFRESH_TOKEN=$(gcloud auth print-access-token)
gcloud auth activate-refresh-token $REFRESH_TOKEN
I opened a terminal in the JupyterLab instance I'm tinkering with. I was able to activate a refresh token and got Activated refresh token credentials: [***]
after the third step. When I tried to use that token as the Bearer token, I got back a 403 HTTP status code with Forbidden.
The same happens if I perform a regular (non-brief, non-quiet) gcloud auth print-access-token
in that terminal and use that token: a 403 as well.
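For reference, an access token can also be minted programmatically instead of shelling out to gcloud. A minimal sketch using the google-auth library (the machinery behind gcloud auth print-access-token), assuming Application Default Credentials are available in the environment (e.g. a service account key via GOOGLE_APPLICATION_CREDENTIALS, or an attached service account on GCE/Cloud Run):

```python
import google.auth
from google.auth.transport.requests import Request


def get_access_token() -> str:
    """Mint an OAuth2 access token from Application Default Credentials."""
    credentials, _project = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    credentials.refresh(Request())  # fetches a fresh short-lived token
    return credentials.token
```

The returned string can then be used in the "Authorization: Bearer <token>" header of the REST call above.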
Kudos to Anish Nangia of Google for pointing out that I was looking at the wrong code. The OAuth code in my question won't work; here is the code I should use instead: https://cloud.google.com/vertex-ai/docs/generative-ai/image/visual-question-answering#-python
Note that when experimenting in my local Conda Jupyter notebooks (https://github.com/CsabaConsulting/NextGenAI/blob/main/ImagenTest.ipynb) I'd still need to deal with ADC (Application Default Credentials); see https://cloud.google.com/docs/authentication#auth-decision-tree and https://cloud.google.com/docs/authentication/application-default-credentials
You'll then get a warning such as Your application is authenticating by using local Application Default Credentials. The aiplatform.googleapis.com API requires a quota project, which is not set by default. To learn how to set your quota project...
So there are interesting hoops, but they can be tackled.
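The ADC setup and the quota-project warning can both be handled with the gcloud CLI; a sketch (substitute your own project ID for gdg-demos):

```shell
# Establish local Application Default Credentials (opens a browser once)
gcloud auth application-default login

# Pin the quota project that aiplatform.googleapis.com requires
gcloud auth application-default set-quota-project gdg-demos
```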
When deploying as a Cloud Function, you want to run it under the right service account. Example code: https://github.com/CsabaConsulting/NextGenAI/tree/main/imagen_test
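A hypothetical deploy command along those lines: the function runs as a dedicated service account that has been granted the Vertex AI User role (roles/aiplatform.user). The service account email below is a placeholder, not from the repository:

```shell
gcloud functions deploy imagen-test \
  --gen2 --runtime=python311 --region=us-central1 \
  --source=. --entry-point=imagen_test --trigger-http \
  --service-account=imagen-fn@gdg-demos.iam.gserviceaccount.com
```

With an attached service account, the Vertex AI SDK picks up credentials automatically; no bearer-token plumbing is needed in the function code.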
requirements.txt:
functions-framework==3.*
google-cloud-aiplatform==1.35.*
main.py:
import base64
import functions_framework
import vertexai
from flask import jsonify
from vertexai.vision_models import ImageQnAModel, ImageTextModel, Image
PROJECT_ID = "gdg-demos"
LOCATION = "us-central1"
@functions_framework.http
def imagen_test(request):
    """HTTP Cloud Function.
    Args:
        request (flask.Request): The request object.
        <https://flask.palletsprojects.com/en/1.1.x/api/#incoming-request-data>
    Returns:
        The response text, or any set of values that can be turned into a
        Response object using `make_response`
        <https://flask.palletsprojects.com/en/1.1.x/api/#flask.make_response>.
    """
    request_json = request.get_json(silent=True)
    request_args = request.args

    if request_json and 'image' in request_json:
        image_b64 = request_json['image']
    elif request_args and 'image' in request_args:
        image_b64 = request_args['image']
    else:
        image_b64 = None

    if not image_b64:
        return jsonify(dict(data=[]))

    vertexai.init(project=PROJECT_ID, location=LOCATION)
    model = ImageQnAModel.from_pretrained("imagetext@001")
    image_binary = base64.b64decode(image_b64)
    image = Image(image_binary)
    answers = model.ask_question(
        image=image,
        question="Describe what is on the photo in great detail, be very verbose",
        number_of_results=3,
    )
    return jsonify(dict(data=answers))
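Once deployed, the function can be called from any client by POSTing the base64-encoded image. A minimal sketch; the function URL is a placeholder for your actual deployment:

```python
import base64

import requests


def ask_imagen(function_url: str, image_bytes: bytes) -> list:
    """POST a base64-encoded image to the deployed Cloud Function
    and return the list of VQA answers from its JSON response."""
    payload = {"image": base64.b64encode(image_bytes).decode("ascii")}
    resp = requests.post(function_url, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["data"]


# Example usage (placeholder URL):
# answers = ask_imagen(
#     "https://us-central1-gdg-demos.cloudfunctions.net/imagen-test",
#     open("photo.jpg", "rb").read(),
# )
```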