When I initiate a batch prediction job on Google Cloud Vertex AI, I have to specify a Cloud Storage bucket location. Suppose I provide the bucket location 'my_bucket/prediction/';
the prediction files are then stored in something like: gs://my_bucket/prediction/prediction-test_model-2022_01_17T01_46_39_898Z
, which is a subdirectory within the bucket location I provided. The prediction files stored within that subdirectory are named:
prediction.results-00000-of-00002
prediction.results-00001-of-00002
Is there any way to programmatically get the final export location from the batch prediction name, ID, or any other parameter shown below in the details of the batch prediction job?
Not from those parameters alone, because if you run the same job multiple times, new folders named by execution date will be created. But you can get it from the API using your job ID (don't forget to set the credentials via GOOGLE_APPLICATION_CREDENTIALS
if you are not running on the Cloud SDK):
curl -H "Authorization: Bearer "$(gcloud auth application-default print-access-token) "https://us-central1-aiplatform.googleapis.com/v1/projects/[PROJECT_NAME]/locations/us-central1/batchPredictionJobs/[JOB_ID]"
Output (get the value from gcsOutputDirectory):
{
...
"gcsOutputDirectory": "gs://my_bucket/prediction/prediction-test_model-2022_01_17T01_46_39_898Z"
...
}
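The same job lookup can be done with the Vertex AI Python client library (google-cloud-aiplatform):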
from google.cloud import aiplatform


def get_batch_prediction_job_sample(
    project: str,
    batch_prediction_job_id: str,
    location: str = "us-central1",
    api_endpoint: str = "us-central1-aiplatform.googleapis.com",
):
    # The client must point at the same regional endpoint as the job.
    client_options = {"api_endpoint": api_endpoint}
    client = aiplatform.gapic.JobServiceClient(client_options=client_options)
    # Build the fully qualified resource name:
    # projects/{project}/locations/{location}/batchPredictionJobs/{job_id}
    name = client.batch_prediction_job_path(
        project=project, location=location, batch_prediction_job=batch_prediction_job_id
    )
    response = client.get_batch_prediction_job(name=name)
    print("response:", response)
    # The export location is in output_info.gcs_output_directory
    print("export location:", response.output_info.gcs_output_directory)


get_batch_prediction_job_sample(
    "[PROJECT_NAME]", "[JOB_ID]", "us-central1", "us-central1-aiplatform.googleapis.com"
)
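If you also need the individual prediction files (prediction.results-00000-of-00002, etc.), here is a minimal sketch that lists everything under that directory, assuming the google-cloud-storage client library is installed and Python 3.9+ (list_prediction_files is just an illustrative helper name):

from google.cloud import storage


def list_prediction_files(gcs_output_directory: str):
    # gcs_output_directory looks like:
    # gs://my_bucket/prediction/prediction-test_model-2022_01_17T01_46_39_898Z
    bucket_name, _, prefix = gcs_output_directory.removeprefix("gs://").partition("/")
    client = storage.Client()
    # Lists every object under the job's output prefix, e.g.
    # prediction/prediction-test_model-.../prediction.results-00000-of-00002
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        print(f"gs://{bucket_name}/{blob.name}")


# e.g., with the response from get_batch_prediction_job_sample above:
# list_prediction_files(response.output_info.gcs_output_directory)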