Tags: kubernetes, load-balancing

K8s ANY load balancing?


I have been getting more and more into k8s, and I am trying things out on a personal VPS I have. I have created a Deployment of a Pod that is used by another Service internally, and I would love to validate that requests between these two services are actually being load-balanced.

Here was my attempt to create this: I have a simple service, which I called metric-test, that has a single endpoint which counts how many times it has been called, logs the count, and returns it. I used the Jooby microframework for this since I was already familiar with it and could get started quickly.

The code of this simple app can be found on github

The repository also contains the deployment.yaml file, which I use to deploy it to my local minikube instance (which simulates my k8s environment).

Steps taken:

  1. Point my shell's Docker CLI at minikube's Docker daemon, so the image ends up in minikube's image store: eval $(minikube docker-env)
  2. Build the project's Docker image with docker build . -t metric-test1
  3. Apply the deployment file with kubectl apply -f deployment.yaml (the file is also at the GitHub link)

This gives me a Service of type ClusterIP (which is what I want, since it should not be accessible from outside) and two Pods running the Jooby code. Here is the deployment.yaml file:

apiVersion: v1
kind: Service
metadata:
  name: metric-test
  labels:
    run: metric-test
spec:
  ports:
    - port: 3000
      protocol: TCP
  selector:
    run: metric-test

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metric-test
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      run: metric-test
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        run: metric-test
    spec:
      containers:
        - image: metric-test1
          imagePullPolicy: Never
          name: data-api
          ports:
            - containerPort: 3000
              protocol: TCP
              name: 3000tcp
      restartPolicy: Always
      schedulerName: default-scheduler
      terminationGracePeriodSeconds: 30
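As a side note on the manifest above: since the container port already has a name (3000tcp), the Service could target it by name instead of number, which keeps the two in sync if the container port ever changes. A sketch of the alternative Service ports section, functionally equivalent to the one above:

```yaml
spec:
  ports:
    - port: 3000           # port the Service listens on
      targetPort: 3000tcp  # refers to the named containerPort in the Pod spec
      protocol: TCP
```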

Alright everything working! I just set up a port-forwarding so I can access the service: kubectl port-forward service/metric-test 3000:3000

And use this script to fire a lot of requests at the service:

#!/bin/bash
target=${1:-http://localhost:3000}
while true # loop forever, until ctrl+c is pressed.
do
        for i in $(seq 100) # perform the inner command 100 times.
        do
                curl -s "$target" > /dev/null & # send a curl request; the & means don't wait for the response.
        done

        wait # after 100 requests are sent out, wait for their processes to finish before the next iteration.
done
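The fire-100-then-wait pattern in the script can be sketched in isolation; here a no-op stands in for the curl call so the control flow can be run anywhere (the batch count of 3 is arbitrary):

```shell
#!/bin/sh
# Sketch of the same fan-out/wait pattern with a no-op replacing curl.
total=0
for batch in $(seq 3)           # 3 batches instead of an infinite loop
do
        for i in $(seq 100)     # fire 100 background jobs per batch
        do
                true &          # stand-in for: curl -s "$target" > /dev/null
        done
        wait                    # block until all 100 background jobs finish
        total=$((total + 100))
done
echo "sent $total requests"     # → sent 300 requests
```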

I am now seeing ALL of the requests being handled by only one of the Pods, while the other just sits there idle.

I went through the documentation (here), but to be honest I came away with more questions than answers. That is why I tried to create this simplified scenario to test things out.

Questions

  1. Can anyone help me out?
  2. What am I missing? How can I achieve this load balancing without exposing the service to the internet? I just want it to be available to other Pods in the cluster.
  3. (Bonus) The front-end services that are served by an Ingress should all be properly load-balanced, right?

Note: As far as I understood, using a LoadBalancer or an Ingress you can actually achieve load balancing; however, they also expose the service to the outside.

EDIT 1

More info on the deployment: Result of kubectl get po

NAME                          READY   STATUS    RESTARTS   AGE
metric-test-f89bfbf86-ccrj8   1/1     Running   0          16h
metric-test-f89bfbf86-kl7qg   1/1     Running   0          16h
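To quantify the imbalance rather than eyeballing the logs, the per-Pod request counts can be tallied. The snippet below is only a sketch: the sample log lines are invented for illustration (the real output format of the app will differ), standing in for something like the output of kubectl logs --prefix -l run=metric-test:

```shell
#!/bin/sh
# Invented sample lines standing in for real per-pod log output.
logs='metric-test-f89bfbf86-ccrj8 called 1 times
metric-test-f89bfbf86-ccrj8 called 2 times
metric-test-f89bfbf86-ccrj8 called 3 times
metric-test-f89bfbf86-kl7qg called 1 times'

# Tally request lines per pod name (first field of each line).
printf '%s\n' "$logs" \
  | awk '{count[$1]++} END {for (p in count) print p, count[p]}' \
  | sort
```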

Here is a picture of the logs of both Pods after running the curl script for a bit: Logs of both Pods

EDIT FOR SOLUTION

As m303945 said, load balancing doesn't happen when we go through port-forwarding: kubectl port-forward tunnels directly to a single Pod rather than through the Service.

In order to validate this, and any future tests I might want to do I did the following:

I've ran the following command in my terminal:

kubectl run -it --rm --restart=Never --image=alpine handytools -n ${1:-default} -- /bin/ash

which creates an Alpine-based container and gives me shell access. At that point, however, I cannot use curl since it is not installed, so I ran:

  • apk update
  • apk add curl

Once I had that, I modified my previous bash script from above to run on this pod and hit the service I set up:

#!/bin/ash
target=${1:-http://metric-test:3000}
for batch in $(seq 5) # loop 5 times to generate 500 calls in total.
do
        for i in $(seq 100) # perform the inner command 100 times.
        do
                curl -s "$target" > /dev/null & # send a curl request; the & means don't wait for the response.
        done

        wait # after 100 requests are sent out, wait for their processes to finish before the next iteration.
done

The modifications: the target now points at the Service's DNS name (metric-test) instead of localhost, the shebang uses ash since Alpine ships ash rather than bash, and the script sends 500 requests instead of looping forever.

As you can see, running the above (which you can create using vi inside the Alpine container), I get a nice, even load distribution! even distribution of calls

Thanks again to user m303945 for pointing me in the right direction.


Solution

  • If I remember correctly, TCP load balancing does not happen when using port forwarding, since the tunnel goes to a single Pod. Try running the script from a container inside the k8s cluster instead of doing port forwarding.