My colleague and I have a question about how a single Kubernetes Pod can serve multiple users. Our belief was that one Pod serves a single end user. Here is our experiment.
We have a basic Angular frontend talking to a .NET API server, which uses a MySQL database backed by a PersistentVolumeClaim, all running on a Linode Kubernetes cluster. The cluster has 3 nodes with plenty of RAM, disk space, and CPU. Here is the output of 'kubectl get all':
C:\development\Kubernetes >kubectl get all
NAME                                  READY   STATUS    RESTARTS   AGE
pod/dataapi-server-57c87f56c6-2fcdz   1/1     Running   0          21h
pod/gems-deployment-c9f6dbdbb-2qhff   1/1     Running   0          20h
pod/mysql-6d4b9f6f46-8m66h            1/1     Running   0          21h

NAME                            TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
service/dataapi-server          ClusterIP      10.128.6.155    <none>          80/TCP         21h
service/gems-frontend-service   LoadBalancer   10.128.76.185   104.***.**.**   80:31190/TCP   20h
service/kubernetes              ClusterIP      10.128.0.1      <none>          443/TCP        4d23h
service/mysql                   ClusterIP      None            <none>          3306/TCP       21h

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/dataapi-server    1/1     1            1           21h
deployment.apps/gems-deployment   1/1     1            1           20h
deployment.apps/mysql             1/1     1            1           21h
My colleague and I, working side by side on two different client machines, launched the frontend in our respective Chrome browsers and each did some basic independent work. To our surprise, we both hit the same Pod ('gems-deployment-c9...') and were able to work productively. Based on everything we understood about Pods and virtual machines, this did not make sense to us. I will note here that we are fairly new to the Kubernetes and containers world.
Can anyone explain why this works? Are there any papers you could point us to? My Google searches have only turned up material on running multiple containers per Pod, which is not our case.
Thanks in advance!
When you make a regular HTTP GET request to your Angular app, the LoadBalancer Service forwards the request to one of the Pods whose labels match its selector. The Service does not pick a Pod based on the strain on the nodes; kube-proxy distributes connections across the matching Pods essentially at random (round-robin in some proxy modes). Because you have only one frontend Pod, every request lands on that Pod.
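You can see exactly which Pod IPs your Service forwards to with a standard kubectl command (service name taken from your output above); with a single replica there is only one endpoint, which is why both of you always land on the same Pod:

C:\development\Kubernetes >kubectl get endpoints gems-frontend-service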
The web server inside that Pod then handles the connection the way any web server does: it accepts the request (on a worker thread or an event loop, depending on the server) and holds the connection open just long enough to send you a response, which is the webpage. If your requests arrive simultaneously, the server simply handles both connections concurrently, exactly as it would if you were hosting this frontend outside of Kubernetes. A Pod is not a per-user virtual machine; it is one or more containers running ordinary server processes, and a single process can serve many clients at once.
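If you want to watch this happen, stream the frontend Pod's logs while you both browse (Pod name taken from your 'kubectl get all' output above; this assumes the web server image in the Pod writes an access log line per request to stdout, as the stock nginx image does, for example):

C:\development\Kubernetes >kubectl logs -f pod/gems-deployment-c9f6dbdbb-2qhff

Requests from both of your client machines will appear interleaved in the output of that one Pod.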
For what you're worrying about to become a problem, the Pod would have to receive more requests than its server process could handle. In that case, scaling the Deployment out to more replicas (so the Service spreads traffic across several Pods) or giving the nodes more resources are both valid responses.
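As a minimal sketch of scaling out (the replica count of 3 here is arbitrary, not a recommendation):

C:\development\Kubernetes >kubectl scale deployment gems-deployment --replicas=3

After this, 'kubectl get endpoints gems-frontend-service' will list three Pod IPs, and the Service will spread incoming connections across them.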