I'm writing a task manager for Azure Batch in Python. When I run the manager and add a job to the specified Azure Batch account, I do:
Unfortunately, it fails between steps 2 and 3. Even though I issue the deletion command for the specified job and check that there is no job with the same id in the Azure Batch account, I get a BatchErrorException like the following when I try to create the job again:
Exception encountered:
The specified job has been marked for deletion and is being garbage collected.
The code I use to delete the job is the following:
    def deleteJob(self, jobId):
        print("Delete job [{}]".format(jobId))
        self.__batchClient.job.delete(jobId)
        # Wait until the job is deleted
        # 10 minutes timeout for the operation to succeed
        timeout = datetime.timedelta(minutes=10)
        timeout_expiration = datetime.datetime.now() + timeout
        while True:
            try:
                # As long as we can retrieve data related to the job, it means it is still deleting
                self.__batchClient.job.get(jobId)
            except batchmodels.BatchErrorException:
                print("Job {jobId} deleted correctly.".format(
                    jobId=jobId
                ))
                break
            time.sleep(2)
            if datetime.datetime.now() > timeout_expiration:
                raise RuntimeError("ERROR: couldn't delete job [{jobId}] within timeout period of {timeout}.".format(
                    jobId=jobId,
                    timeout=timeout
                ))
I tried to check the Azure SDK, but couldn't find a method that would tell me exactly when a job was completely deleted.
Querying for existence of the job is the only way to determine if a job has been deleted from the system.
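One caveat with polling for existence: catching every BatchErrorException treats any failure (throttling, authentication, a transient network error) as "deleted". A minimal sketch of a stricter loop follows. It assumes the exception exposes an `error.code` attribute and that a fully removed job is reported with the code `"JobNotFound"`; both should be checked against the SDK version in use. The helper name `wait_until_job_gone` and its parameters are hypothetical, and the getter is passed in as a callable so the loop does not depend on a particular client object.

```python
import time


def wait_until_job_gone(get_job, job_id, timeout_s=600.0, poll_s=2.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll get_job(job_id) until it fails with a 'JobNotFound' error code.

    Returns True once the job is gone, False if the timeout expires first.
    Any other error is re-raised instead of being read as "deleted".
    """
    deadline = clock() + timeout_s
    while True:
        try:
            get_job(job_id)  # still retrievable, so deletion is in progress
        except Exception as exc:
            # Assumption: the exception carries .error.code, as
            # BatchErrorException is expected to.
            code = getattr(getattr(exc, "error", None), "code", None)
            if code == "JobNotFound":
                return True
            raise
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

With a real client this would be called roughly as `wait_until_job_gone(batch_client.job.get, job_id)`, where `batch_client` stands for whatever BatchServiceClient instance the manager holds.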
Alternatively, if you do not strictly need to reuse the same job id, you can issue the delete and then create the new job under a different id. The deletion then proceeds asynchronously, off your critical path.
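One way to get a different id on every run is to append a short random suffix to a base name. The sketch below is only an illustration: `unique_job_id` is a hypothetical helper, and the 64-character limit reflects the documented Batch job id constraint as I understand it (alphanumerics, hyphens and underscores), so verify it for your account.

```python
import uuid


def unique_job_id(base_id, max_len=64):
    """Return base_id plus a random 8-character hex suffix.

    The base is truncated so the result never exceeds max_len
    (assumed here to be the Batch job id length limit).
    """
    suffix = uuid.uuid4().hex[:8]
    # Reserve room for the hyphen and the suffix.
    head = base_id[:max_len - len(suffix) - 1]
    return "{}-{}".format(head, suffix)
```

The manager can then delete the old job without waiting and immediately create the new one with `unique_job_id("myjob")`, keeping the base name as a prefix so related jobs are still easy to list and filter.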