In Databricks, I've manually created a DAG job-of-jobs (task type "Run job") that executes several sub-jobs. When I run it manually, it works well, and I can see it executing the sub-jobs to completion in the run.
The issue is that I want to execute this job from a Python notebook, and I don't understand what the JSON payload needs to contain for the call to the Databricks API to succeed. I've tried omitting settings, leaving settings empty, copying the settings JSON from the Databricks DAG job itself into settings, and replacing the entire data JSON with the full job-of-jobs JSON text. All of these return the same error:
{"error_code":"INVALID_PARAMETER_VALUE","message":"Run settings must be specified."}
What does my JSON need to be, minimally, to execute this job_id (which is, in essence, a Databricks job of type "Run job")?
import requests

databricks_instance = "https://<my instance>" + ".cloud.databricks.com"
access_token = "<my access token>"

headers = {
    "Authorization": f"Bearer {access_token}",
    "Content-Type": "application/json"
}

data = {
    "job_id": "<my job id>",
    "settings": {<settings copied from job's json>}
}

response = requests.post(f"{databricks_instance}/api/2.0/jobs/runs/submit", json=data, headers=headers)

if response.status_code == 200:
    print(f"Job run triggered successfully for Job ID: {<my job id>}")
else:
    print(f"Failed to trigger job run for Job ID: {<my job id>}")
    print(response.text)
I've tried reviewing https://docs.databricks.com/api/workspace/jobs/runnow but it did not help.
In your code you are using the /jobs/runs/submit endpoint, which is the "Create and trigger a one-time run" API. It creates a new ad hoc job from scratch, so the "settings" it requires are in fact the settings for a Databricks job definition.
Since you already have the job defined and just want to trigger a run of it, you should use the "Trigger a new job run" API, which is /api/2.1/jobs/run-now (you link to the documentation for this API, but it does not match the endpoint in your code). With this API, job_id is the only required parameter.
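As a rough illustration of the change, here is a minimal sketch of the same request pointed at run-now instead of runs/submit. The helper names build_run_now_request and trigger_job_run are hypothetical, the instance, token, and job ID are placeholders you would fill in, and passing job_id as an integer is an assumption based on how the Jobs API documents that field.

```python
def build_run_now_request(databricks_instance, access_token, job_id):
    """Build the URL, headers, and JSON body for a run-now call.

    Hypothetical helper for illustration; it does no network I/O.
    """
    url = f"{databricks_instance}/api/2.1/jobs/run-now"
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    # job_id is the only required field; no "settings" block is needed
    # because the job definition already exists in the workspace.
    payload = {"job_id": int(job_id)}
    return url, headers, payload


def trigger_job_run(databricks_instance, access_token, job_id):
    """Send the run-now request and return the new run's ID, or None."""
    # Imported here so the helper above can be used without requests installed.
    import requests

    url, headers, payload = build_run_now_request(
        databricks_instance, access_token, job_id
    )
    response = requests.post(url, json=payload, headers=headers)
    if response.status_code == 200:
        # A successful response should include the ID of the triggered run.
        run_id = response.json().get("run_id")
        print(f"Job run triggered successfully for Job ID: {job_id}")
        return run_id
    print(f"Failed to trigger job run for Job ID: {job_id}")
    print(response.text)
    return None
```

From the notebook you would then call something like trigger_job_run("https://<my instance>.cloud.databricks.com", "<my access token>", <my job id>), with the job ID given as a number. The sub-jobs should be started by the existing "Run job" tasks, the same way they are when the job is run manually.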