Tags: python, airflow, gcloud, google-cloud-composer

Problem creating a connection in Airflow with gcloud


I have a problem creating a connection in Airflow (Cloud Composer) using a gcloud command. The problem occurs when I try to pass the value of extra__google_cloud_platform__keyfile_dict through --conn_extra: the private key value comes out wrong when accessed from the DAG. The connection type is google_cloud_platform.

Command:

gcloud composer environments run COMPOSER --location LOCATION connections -- --add --conn_id=CONNECTION --conn_type=google_cloud_platform --conn_extra="{\"extra__google_cloud_platform__keyfile_dict\":{\"type\":\"service_account\",\"project_id\":\"PROJECT_ID\",\"private_key_id\":\"-----BEGIN PRIVATE KEY-----\\VALUE\\n-----END PRIVATE KEY-----\\n\"}",\"extra__google_cloud_platform__project\":\"PROJECT_ID\",\"extra__google_cloud_platform__scope\":\"SCOPE\"}"

Error in the DAG:

[2021-04-05 17:53:46,046] {base_task_runner.py:113} INFO - Job 12399: Subtask Initial_query   File "/usr/local/lib/airflow/airflow/contrib/hooks/gcp_api_base_hook.py", line 216, in _authorize
[2021-04-05 17:53:46,046] {base_task_runner.py:113} INFO - Job 12399: Subtask Initial_query     credentials = self._get_credentials()
[2021-04-05 17:53:46,047] {base_task_runner.py:113} INFO - Job 12399: Subtask Initial_query   File "/usr/local/lib/airflow/airflow/contrib/hooks/gcp_api_base_hook.py", line 164, in _get_credentials
[2021-04-05 17:53:46,047] {base_task_runner.py:113} INFO - Job 12399: Subtask Initial_query     keyfile_dict = json.loads(keyfile_dict)
[2021-04-05 17:53:46,047] {base_task_runner.py:113} INFO - Job 12399: Subtask Initial_query   File "/opt/python3.6/lib/python3.6/json/__init__.py", line 348, in loads
[2021-04-05 17:53:46,048] {base_task_runner.py:113} INFO - Job 12399: Subtask Initial_query     'not {!r}'.format(s.__class__.__name__))
[2021-04-05 17:53:46,048] {base_task_runner.py:113} INFO - Job 12399: Subtask Initial_query TypeError: the JSON object must be str, bytes or bytearray, not 'dict'

I think I have a problem with escaping the quotes (' or "). I tried different variations but it still doesn't work. Finally, looking at the source of gcp_api_base_hook, I can see that keyfile_dict is passed to:

keyfile_dict = json.loads(keyfile_dict)

I don't know what's wrong. I hope you can help me. Thanks.
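The traceback can be reproduced in isolation: json.loads raises this TypeError whenever keyfile_dict is already a dict, i.e. when it was embedded in the extras as a nested JSON object instead of a JSON-encoded string. A minimal sketch:

```python
import json

# keyfile_dict embedded as a *nested object* (as in the failing command):
# parsing the extras hands the hook a dict, which then fails in json.loads().
extra = json.loads('{"extra__google_cloud_platform__keyfile_dict": '
                   '{"type": "service_account"}}')
try:
    json.loads(extra["extra__google_cloud_platform__keyfile_dict"])
except TypeError as exc:
    print(exc)  # the JSON object must be str, bytes or bytearray, ...

# keyfile_dict embedded as a JSON-encoded *string*: this parses cleanly.
extra = json.loads('{"extra__google_cloud_platform__keyfile_dict": '
                   '"{\\"type\\": \\"service_account\\"}"}')
keyfile = json.loads(extra["extra__google_cloud_platform__keyfile_dict"])
print(keyfile["type"])  # service_account
```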


Solution

  • Tested solution:

    gcloud composer environments run [COMPOSER] \
    --project [PROJECT_ID] \
    --location [LOCATION] connections -- --add --conn_id=[CONN_ID] \
    --conn_type=google_cloud_platform \
    --conn_extra='{"extra__google_cloud_platform__project":"[PROJECT_ID]","extra__google_cloud_platform__keyfile_dict": "{\"type\":\"service_account\",\"project_id\":\"[PROJECT_ID]\",\"private_key_id\": \"[PRIVATE_KEY_ID]\",
      \"private_key\": \"-----BEGIN PRIVATE KEY-----\\nXXX\\nXXX\\nXXX\\n-----END PRIVATE KEY-----\\n\",
      \"client_email\":\"[CLIENT_EMAIL]\",\"client_id\":\"[CLIENT_ID]\",\"auth_uri\": \"[AUTH_URI]\",\"token_uri\":\"[TOKEN_URL]\",\"auth_provider_x509_cert_url\":\"[CERT_URL]\",\"client_x509_cert_url\": \"[x509_CERT_URL]\"}","extra__google_cloud_platform__scope":"[SCOPE]"}'
    

    It is important to note that in the private key each newline ("\n") must be escaped as "\\n". The Airflow class restores these values internally.
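    Rather than escaping the quotes and newlines by hand, one option (a sketch, not part of the tested solution above; all field values are placeholders) is to let json.dumps produce the doubly-encoded value: dump the keyfile dict to a string first, then dump the whole extras object.

    ```python
    import json

    # Placeholder service-account fields; substitute the values from your key file.
    keyfile = {
        "type": "service_account",
        "project_id": "PROJECT_ID",
        "private_key_id": "PRIVATE_KEY_ID",
        "private_key": "-----BEGIN PRIVATE KEY-----\nXXX\n-----END PRIVATE KEY-----\n",
        "client_email": "CLIENT_EMAIL",
    }

    extra = {
        "extra__google_cloud_platform__project": "PROJECT_ID",
        # Must be a JSON-encoded *string*, not a nested object, so it is
        # dumped once here and again (with everything else) below.
        "extra__google_cloud_platform__keyfile_dict": json.dumps(keyfile),
        "extra__google_cloud_platform__scope": "SCOPE",
    }

    # Single-quote this output in the shell when passing it to --conn_extra;
    # json.dumps has already escaped the inner quotes and newlines.
    print(json.dumps(extra))
    ```

    This also takes care of the "\\n" requirement automatically: json.dumps turns the real newlines in private_key into the escaped sequences the connection string needs.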