My colleague and I have a question about how a single Kubernetes Pod can serve multiple users. Our belief was that one Pod serves a single end user. Here is our experiment.
We have a basic Angular frontend talking to a .NET API server, which uses a MySQL database backed by a PersistentVolumeClaim, all running on a Linode Kubernetes cluster. The cluster has 3 nodes with plenty of RAM, disk space, and CPU. Here is the output of 'kubectl get all':
C:\development\Kubernetes >kubectl get all
NAME                                  READY   STATUS    RESTARTS   AGE
pod/dataapi-server-57c87f56c6-2fcdz   1/1     Running   0          21h
pod/gems-deployment-c9f6dbdbb-2qhff   1/1     Running   0          20h
pod/mysql-6d4b9f6f46-8m66h            1/1     Running   0          21h

NAME                            TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
service/dataapi-server          ClusterIP      10.128.6.155    <none>          80/TCP         21h
service/gems-frontend-service   LoadBalancer   10.128.76.185   104.***.**.**   80:31190/TCP   20h
service/kubernetes              ClusterIP      10.128.0.1      <none>          443/TCP        4d23h
service/mysql                   ClusterIP      None            <none>          3306/TCP       21h

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/dataapi-server    1/1     1            1           21h
deployment.apps/gems-deployment   1/1     1            1           20h
deployment.apps/mysql             1/1     1            1           21h
My colleague and I, working side by side on two different client machines, launched the frontend in our respective Chrome browsers and each did some basic independent work. To our surprise, we both hit the same Pod ('gems-deployment-c9...') and were able to work productively. Based on everything we understood about Pods and virtual machines, this did not make sense to us. I will note here that we are fairly new to the Kubernetes and containers world.
Can anyone explain why this works? Are there any papers you could point us to? My Google searches have only turned up material on running multiple containers per Pod, which is not our case.
Thanks in advance!
When you make a regular HTTP GET request to your Angular app, the LoadBalancer Service forwards the request to one of the Pods whose labels match its selector. The Service does not pick a Pod based on the strain on the nodes; kube-proxy distributes connections across the matching Pods essentially at random (round-robin in some proxy modes). Because you have only one frontend Pod, every request lands on that Pod.
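You can see exactly which Pod IPs your Service forwards to with a standard kubectl command (service name taken from your output above); with a single replica there is only one endpoint, which is why both of you always land on the same Pod:

C:\development\Kubernetes >kubectl get endpoints gems-frontend-service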
The web server inside that Pod then handles the connection the way any web server does: it accepts the request (on a worker thread or an event loop, depending on the server) and holds the connection open just long enough to send you a response, which is the webpage. If your requests arrive simultaneously, the server simply handles both connections concurrently, exactly as it would if you were hosting this frontend outside of Kubernetes. A Pod is not a per-user virtual machine; it is one or more containers running ordinary server processes, and a single process can serve many clients at once.
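If you want to watch this happen, stream the frontend Pod's logs while you both browse (Pod name taken from your 'kubectl get all' output above; this assumes the web server image in the Pod writes an access log line per request to stdout, as the stock nginx image does, for example):

C:\development\Kubernetes >kubectl logs -f pod/gems-deployment-c9f6dbdbb-2qhff

Requests from both of your client machines will appear interleaved in the output of that one Pod.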
For what you're worrying about to become a problem, the Pod would have to receive more requests than its server process could handle. In that case, scaling the Deployment out to more replicas (so the Service spreads traffic across several Pods) or giving the nodes more resources are both valid responses.
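As a minimal sketch of scaling out (the replica count of 3 here is arbitrary, not a recommendation):

C:\development\Kubernetes >kubectl scale deployment gems-deployment --replicas=3

After this, 'kubectl get endpoints gems-frontend-service' will list three Pod IPs, and the Service will spread incoming connections across them.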