I'm trying to add a security layer to my Dask cluster, deployed with Helm on GKE (GCP), that forces users to pass the certificate and key files to the Security object, as explained in this documentation [1]. Unfortunately, the client times out because the scheduler pod crashes. The scheduler logs show the following error:
Traceback (most recent call last):
  File "/opt/conda/bin/dask-scheduler", line 10, in <module>
    sys.exit(go())
  File "/opt/conda/lib/python3.7/site-packages/distributed/cli/dask_scheduler.py", line 226, in go
    main()
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/distributed/cli/dask_scheduler.py", line 206, in main
    **kwargs
  File "/opt/conda/lib/python3.7/site-packages/distributed/scheduler.py", line 1143, in __init__
    self.connection_args = self.security.get_connection_args("scheduler")
  File "/opt/conda/lib/python3.7/site-packages/distributed/security.py", line 224, in get_connection_args
    "ssl_context": self._get_tls_context(tls, ssl.Purpose.SERVER_AUTH),
  File "/opt/conda/lib/python3.7/site-packages/distributed/security.py", line 187, in _get_tls_context
    ctx = ssl.create_default_context(purpose=purpose, cafile=ca)
  File "/opt/conda/lib/python3.7/ssl.py", line 584, in create_default_context
    context.load_verify_locations(cafile, capath, cadata)
FileNotFoundError: [Errno 2] No such file or directory
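The last two frames of the traceback can likely be reproduced outside Dask with the standard library alone. The sketch below makes the same `ssl.create_default_context` call with a CA path that does not exist; the helper name `ca_file_error` and the path are made up for illustration and are not part of Dask.

```python
import ssl


def ca_file_error(cafile):
    """Return the FileNotFoundError raised while loading cafile, or None if it loads."""
    try:
        # Same call as in distributed/security.py according to the traceback.
        ssl.create_default_context(purpose=ssl.Purpose.SERVER_AUTH, cafile=cafile)
    except FileNotFoundError as exc:
        return exc
    return None


if __name__ == "__main__":
    # Hypothetical path that is assumed not to exist on this machine.
    err = ca_file_error("/nonexistent-dir/myca.pem")
    print(type(err).__name__, err)
```

If this fails the same way for the path configured in the pod, the problem is the file location rather than the certificate contents.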
The Helm config YAML file is as follows:
scheduler:
  allowed-failures: 5
  env:
    - name: DASK_DISTRIBUTED__COMM__DEFAULT_SCHEME
      value: "tls"
    - name: DASK_DISTRIBUTED__COMM__REQUIRE_ENCRYPTION
      value: "true"
    - name: DASK_DISTRIBUTED__COMM__TLS__CA_FILE
      value: "myca.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__SCHEDULER__KEY
      value: "mykey.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__SCHEDULER__CERT
      value: "myca.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__WORKER__KEY
      value: "mykey.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__WORKER__CERT
      value: "myca.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__CLIENT__KEY
      value: "mykey.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__CLIENT__CERT
      value: "myca.pem"
I create the key and certificate files as follows:
openssl req -newkey rsa:4096 -nodes -sha256 -x509 -days 3650 -out myca.pem -keyout mykey.pem
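Before wiring the generated files into the cluster, it may help to confirm that the certificate and key actually load as a pair. This is only a sketch using the standard `ssl` module; the helper name `cert_key_error` is hypothetical, and the file names are assumed to match the `openssl` command above.

```python
import ssl


def cert_key_error(certfile, keyfile):
    """Return the exception raised while loading the pair, or None if it loads."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        # Raises FileNotFoundError for a missing file and ssl.SSLError
        # when the private key does not match the certificate.
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    except (ssl.SSLError, OSError) as exc:
        return exc
    return None


if __name__ == "__main__":
    # Assumes the files were written to the current working directory.
    print(cert_key_error("myca.pem", "mykey.pem"))
```

A result of `None` means the pair loaded; anything else is the error to look into.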
Here is a minimal, complete, verifiable example:
import dask.dataframe as dd
from dask.distributed import Client
from distributed.security import Security

# Client-side TLS settings; the paths are resolved on the client machine.
sec = Security(tls_ca_file='myca.pem',
               tls_client_cert='myca.pem',
               tls_client_key='mykey.pem',
               require_encryption=True)

# Connect to the scheduler over TLS and count the rows of a CSV file.
with Client("tls://<scheduler_ip>:8786", security=sec) as dask_client:
    ddf = dd.read_csv('gs://<bucket_name>/my_file.csv',
                      engine='python',
                      error_bad_lines=False,
                      encoding="utf-8",
                      assume_missing=True
                      )
    print(ddf.shape[0].compute())
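I resolved the issue. The scheduler crashed with the FileNotFoundError because the certificate files referenced by the environment variables did not exist inside the container. Both the Dask workers and the scheduler need the certificate settings in their config, and the certificate files themselves need to be baked into the Docker image. See the full config below: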
Dockerfile
FROM daskdev/dask
RUN conda install --yes \
    -c conda-forge \
    python==3.7
ADD certs /certs/
ENTRYPOINT ["tini", "-g", "--", "/usr/bin/prepare.sh"]
Helm Config
worker:
  name: worker
  image:
    repository: "gcr.io/PROJECT_ID/mydask"
    tag: "latest"
  env:
    - name: DASK_DISTRIBUTED__COMM__DEFAULT_SCHEME
      value: "tls"
    - name: DASK_DISTRIBUTED__COMM__REQUIRE_ENCRYPTION
      value: "true"
    - name: DASK_DISTRIBUTED__COMM__TLS__CA_FILE
      value: "certs/myca.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__SCHEDULER__KEY
      value: "certs/mykey.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__SCHEDULER__CERT
      value: "certs/myca.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__WORKER__KEY
      value: "certs/mykey.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__WORKER__CERT
      value: "certs/myca.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__CLIENT__KEY
      value: "certs/mykey.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__CLIENT__CERT
      value: "certs/myca.pem"
scheduler:
  name: scheduler
  image:
    repository: "gcr.io/PROJECT_ID/mydask"
    tag: "latest"
  env:
    - name: DASK_DISTRIBUTED__COMM__DEFAULT_SCHEME
      value: "tls"
    - name: DASK_DISTRIBUTED__COMM__REQUIRE_ENCRYPTION
      value: "true"
    - name: DASK_DISTRIBUTED__COMM__TLS__CA_FILE
      value: "certs/myca.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__SCHEDULER__KEY
      value: "certs/mykey.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__SCHEDULER__CERT
      value: "certs/myca.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__WORKER__KEY
      value: "certs/mykey.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__WORKER__CERT
      value: "certs/myca.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__CLIENT__KEY
      value: "certs/mykey.pem"
    - name: DASK_DISTRIBUTED__COMM__TLS__CLIENT__CERT
      value: "certs/myca.pem"
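To confirm that the paths in the config above resolve inside a running pod, a small check along these lines could be run in the scheduler or worker container. It is only a sketch: the helper `missing_tls_files` and the constants are hypothetical names, and it assumes that only the variables ending in `CA_FILE`, `__KEY`, or `__CERT` hold file paths, as in the config shown here.

```python
import os

# Prefix shared by the TLS settings in the Helm config above.
TLS_ENV_PREFIX = "DASK_DISTRIBUTED__COMM__TLS__"
# Assumption: only these suffixes name files on disk.
PATH_SUFFIXES = ("CA_FILE", "__KEY", "__CERT")


def missing_tls_files(environ=None):
    """Map each TLS path variable to its value when that path is not a file."""
    environ = os.environ if environ is None else environ
    return {
        name: path
        for name, path in environ.items()
        if name.startswith(TLS_ENV_PREFIX)
        and name.endswith(PATH_SUFFIXES)
        and not os.path.isfile(path)
    }


if __name__ == "__main__":
    for name, path in sorted(missing_tls_files().items()):
        # Relative paths are resolved against the current working directory.
        print(f"{name} -> {os.path.abspath(path)} (not found)")
```

Because the values above are relative (`certs/myca.pem`), the result depends on the working directory of the process, so the check is most meaningful when run from the same directory the scheduler or worker starts in.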