How to pass a nested dictionary as an environment variable to KubernetesPodOperator


In my DAG script:

PARAM = {
    'key1' : 'value1',
    'key2' : 'value2'
}

t1 = KubernetesPodOperator(
    task_id=task_id,
    name=task_name,
    cmds=["pipenv", "run", "python3", "myscript.py"],
    env_vars={
        'GCS_PROJECT': GCS_PROJECT,
        'GCS_BUCKET': GCS_BUCKET,
        'PARAM': PARAM # this line throws an error
    },
    image=docker_image
)

Passing the dictionary (PARAM) as shown above fails. I also tried passing two lists (a list of keys and a list of values, so that I could zip them later), but that didn't work either. The error message looks something like this:

kubernetes.client.exceptions.ApiException: (400)
Reason: Bad Request
HTTP response headers: ......
HTTP response body: {
    ......    
    "apiVersion":"v1",
    "metadata":{},
    "status":"Failure",
    "message":"Pod in version \"v1\" cannot be handled as a Pod: json: cannot be unmarshal object into Go struct field EnvVar.spec.containers.env.value of type string"
    ....
}

Is there a way to pass PARAM?


Solution

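  • Environment variable values are strings. You cannot pass a structured value such as a dictionary or a list in an environment variable unless you first convert it to a string (e.g., by serializing it to JSON). That is what the 400 error is reporting: the API server received an object where EnvVar.value must be a string. You could do this: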

    import json  # needed for json.dumps below; put it at the top of the DAG file

    t1 = KubernetesPodOperator(
        task_id=task_id,
        name=task_name,
        cmds=["pipenv", "run", "python3", "myscript.py"],
        env_vars={
            'GCS_PROJECT': GCS_PROJECT,
            'GCS_BUCKET': GCS_BUCKET,
            'PARAM': json.dumps(PARAM),
        },
        image=docker_image,
    )
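
The answer above only covers the DAG side. On the container side, the script has to turn the string back into a dictionary. Here is a minimal sketch of the round trip, assuming myscript.py reads the variable with os.environ; the nested PARAM value and the load_param helper are hypothetical and only for illustration, and the environment is simulated locally rather than set by a real pod:

```python
import json
import os

# Hypothetical nested parameters, standing in for PARAM in the DAG.
PARAM = {"key1": "value1", "nested": {"key2": "value2", "ids": [1, 2, 3]}}

# DAG side: serialize the dictionary to a JSON string, because
# environment variable values must be plain strings.
encoded = json.dumps(PARAM)

# Simulate what the pod would see. In the real setup this value
# arrives through env_vars={'PARAM': json.dumps(PARAM)}.
os.environ["PARAM"] = encoded


def load_param(name="PARAM"):
    """Container side (e.g. in myscript.py): decode the JSON string.

    Falls back to an empty dict if the variable is not set.
    """
    raw = os.environ.get(name, "{}")
    return json.loads(raw)


param = load_param()
print(param["nested"]["ids"])
```

JSON preserves nesting, lists, strings, numbers, booleans and null, so nested dictionaries survive the round trip. Note that non-string dictionary keys come back as strings, and values such as datetime objects need to be converted before calling json.dumps.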