Context: Using Terraform I created an EKS cluster on AWS. On that cluster I installed Nginx Ingress using Helm 3. TLS is performed using Let's Encrypt with cert-manager. Subsequently I can add web exposed applications using deployment, services and ingress yaml files.
Problem: Something that does not work for me is deploying JupyterHub successfully. Installation and exposure work fine, with JupyterHub using the TCP protocol and cert-manager creating the certificates successfully. The problem starts when a user logs in successfully into jupyterhub but a invalid or expired cookie token
occurs when jupyterhub is supposed to spawn a notebook.
Question: It is unclear to me why the spawning does not work and how this can be resolved. Does anyone have a suggestion to better understand the issue?
The jupyterhub_config.py
is as follows:
c = get_config()
c.JupyterHub.authenticator_class = 'jupyterhub.auth.DummyAuthenticator'
c.Authenticator.allowed_users = {'dummy'}
c.Authenticator.admin_users = {'dummy'}
c.DummyAuthenticator.password = "fakenews"
c.JupyterHub.admin_access = True
The deployment.yaml
is as follows:
---
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "1"
generation: 1
labels:
run: jupyterhub
name: jupyterhub
namespace: jhub
spec:
progressDeadlineSeconds: 600
replicas: 2
revisionHistoryLimit: 2
selector:
matchLabels:
run: jupyterhub
template:
metadata:
creationTimestamp: ~
labels:
run: jupyterhub
spec:
containers:
- name: jupyterhub
image: "jupyterhub/jupyterhub:latest"
imagePullPolicy: IfNotPresent
ports:
-
containerPort: 8000
protocol: TCP
terminationMessagePolicy: File
volumeMounts:
-
mountPath: /srv/jupyterhub/jupyterhub_config.py
name: jupyterhub-config
subPath: jupyterhub_config.py
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
volumes:
-
configMap:
name: jupyterhub-config
name: jupyterhub-config
The ingress.yaml
is as follows:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: ingress-resource
annotations:
kubernetes.io/ingress.class: nginx
cert-manager.io/cluster-issuer: "letsencrypt-prod"
nginx.ingress.kubernetes.io/ssl-redirect: "false"
spec:
tls:
- hosts:
- hub.example.com
secretName: hub-example-com-tls
rules:
- host: hub.example.com
http:
paths:
- path: /
backend:
serviceName: jupyterhub
servicePort: 8000
The commands used:
$ kubectl create configmap jupyterhub-config --from-file=./jupyterhub_config.py
$ kubectl create -f deployment.yaml
$ kubectl expose deployment jupyterhub
$ kubectl apply -f ingress.yaml
This results in a successful secure deployment web service on https://hub.example.com
. But after logging in, the jupyterhub container log gives an invalid or expired cookie token
when trying to spawn a jupyter instance.
[I 2020-08-21 08:26:42.725 JupyterHub app:2307] Running JupyterHub version 1.2.0dev
[I 2020-08-21 08:26:42.726 JupyterHub app:2338] Using Authenticator: jupyterhub.auth.DummyAuthenticator-1.2.0dev
[I 2020-08-21 08:26:42.726 JupyterHub app:2338] Using Spawner: jupyterhub.spawner.LocalProcessSpawner-1.2.0dev
[I 2020-08-21 08:26:42.726 JupyterHub app:2338] Using Proxy: jupyterhub.proxy.ConfigurableHTTPProxy-1.2.0dev
[I 2020-08-21 08:26:42.735 JupyterHub app:1442] Writing cookie_secret to /srv/jupyterhub/jupyterhub_cookie_secret
[I 2020-08-21 08:26:42.752 alembic.runtime.migration migration:155] Context impl SQLiteImpl.
[I 2020-08-21 08:26:42.752 alembic.runtime.migration migration:162] Will assume non-transactional DDL.
[I 2020-08-21 08:26:42.758 alembic.runtime.migration migration:515] Running stamp_revision -> 4dc2d5a8c53c
[I 2020-08-21 08:26:42.809 JupyterHub proxy:461] Generating new CONFIGPROXY_AUTH_TOKEN
[I 2020-08-21 08:26:42.850 JupyterHub app:2377] Initialized 0 spawners in 0.002 seconds
[W 2020-08-21 08:26:42.853 JupyterHub proxy:643] Running JupyterHub without SSL. I hope there is SSL termination happening somewhere else...
[I 2020-08-21 08:26:42.853 JupyterHub proxy:646] Starting proxy @ http://:8000
08:26:43.359 [ConfigProxy] info: Proxying http://*:8000 to (no default)
08:26:43.362 [ConfigProxy] info: Proxy API at http://127.0.0.1:8001/api/routes
08:26:43.474 [ConfigProxy] info: 200 GET /api/routes
[I 2020-08-21 08:26:43.475 JupyterHub app:2622] Hub API listening on http://127.0.0.1:8081/hub/
08:26:43.476 [ConfigProxy] info: 200 GET /api/routes
[I 2020-08-21 08:26:43.476 JupyterHub proxy:320] Checking routes
[I 2020-08-21 08:26:43.476 JupyterHub proxy:400] Adding default route for Hub: / => http://127.0.0.1:8081
08:26:43.478 [ConfigProxy] info: Adding route / -> http://127.0.0.1:8081
08:26:43.478 [ConfigProxy] info: Route added / -> http://127.0.0.1:8081
08:26:43.478 [ConfigProxy] info: 201 POST /api/routes/
[I 2020-08-21 08:26:43.479 JupyterHub app:2697] JupyterHub is now running at http://:8000
[I 2020-08-21 08:26:56.023 JupyterHub log:181] 302 GET /hub/ -> /hub/login (@10.0.1.148) 1.16ms
[I 2020-08-21 08:27:01.409 JupyterHub base:742] User logged in: dummy
[I 2020-08-21 08:27:01.429 JupyterHub log:181] 302 POST /hub/login?next= -> /hub/spawn (dummy@10.0.1.148) 68.74ms
[I 2020-08-21 08:27:01.758 JupyterHub log:181] 200 GET /hub/login?next=%2Fhub%2Fspawn (@10.0.1.148) 219.05ms
08:31:43.482 [ConfigProxy] info: 200 GET /api/routes
[I 2020-08-21 08:31:43.482 JupyterHub proxy:320] Checking routes
[I 2020-08-21 12:06:43.482 JupyterHub proxy:320] Checking routes
[I 2020-08-21 12:07:08.386 JupyterHub log:181] 200 GET /hub/login?next=%2Fhub%2Fspawn (@10.0.2.117) 1.85ms
[I 2020-08-21 12:07:13.216 JupyterHub base:742] User logged in: dummy
[I 2020-08-21 12:07:13.217 JupyterHub log:181] 302 POST /hub/login?next=%2Fhub%2Fspawn -> /hub/spawn (dummy@10.0.2.117) 5.40ms
[I 2020-08-21 12:07:13.309 JupyterHub log:181] 200 GET /hub/login?next=%2Fhub%2Fspawn (@10.0.2.117) 1.22ms
[I 2020-08-21 13:27:28.324 JupyterHub log:181] 302 GET / -> /hub/ (@10.0.2.117) 0.90ms
[I 2020-08-21 13:27:28.410 JupyterHub log:181] 200 GET /hub/login (@10.0.2.117) 1.28ms
[W 2020-08-21 13:27:34.613 JupyterHub base:392] Invalid or expired cookie token
[I 2020-08-21 13:27:34.615 JupyterHub log:181] 302 GET /hub/spawn -> /hub/login?next=%2Fhub%2Fspawn (@10.0.2.117) 1.88ms
As OP menitoned, scaling the deploymenty down to 1 replica solved the problem.
I would like to clarify what seems to be the issue.
Jupyterhub is not scalable. It's stateful application and it's (as of now) impossible to make it run Highly Available.
K8s Service is LoadBalancing between two pods/replicas, sending traffic randomly.
When logged in to one jupyterhub you receive a token. Now with this token you send another request. Can you guess what would happen if this request with a token you just received gets sent to the second instance of jupyterhub. The one that has no idea what this token is because its not the one the generated it.
invalid or expired cookie token
This is what you would see. The second instance would find this token invalid.
This is why scaling down to one replica solved the problem. Its because all traffic is now sent to one pod.