So before I used Kubernetes, the general rule I used for running multiple Express instances on a VM was one per CPU. That seemed to give the best performance. For Kubernetes, would it be wise to have a replica per node CPU? Or should I let the HorizontalPodAutoscaler decide? The cluster has a node autoscaler. Thanks for any advice!
Good question!
You need to consider four things:
1. Run the pod using a Deployment, so you get replication, rolling updates, and so on.
2. Set resources.limits on your container definition. This caps what the container may use: CPU above the limit is throttled, and a container that exceeds its memory limit is killed. Note that the limit is not what the HPA measures against.
3. Set resources.requests. This is mandatory for autoscaling on utilization, because the HPA measures usage as a percentage of the request: with no request there is no percentage, and the HPA will never reach the threshold. (If you set only a limit, Kubernetes uses it as the request as well.) The request also helps the scheduler estimate how much the app needs, so the pod is assigned to a node with enough free capacity.
4. Set the HPA threshold: the percentage of usage (CPU, memory) at which the HPA triggers a scale out or scale in.
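To make the last point concrete, here is a minimal sketch of an HPA manifest. It assumes the Deployment is named express, that metrics-server (or another metrics provider) is running in the cluster, and that 70% is a sensible target; the names, the replica bounds, and the threshold are all placeholders, not values from your setup.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: express-hpa          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: express            # assumed Deployment name
  minReplicas: 2             # placeholder lower bound
  maxReplicas: 10            # placeholder upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the CPU request, averaged over pods
```

With a CPU request of 750m, a 70% target means the HPA starts adding replicas once average usage per pod goes above roughly 525m.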
For your situation, you said "one per CPU", so it should be:
    containers:
    - name: express
      image: myapp-node
      #.....
      resources:
        requests:
          memory: "256Mi"
          cpu: "750m"
        limits:
          memory: "512Mi"
          cpu: "1000m" # <-- 🔴 match what you have in the legacy deployment
You may wonder why I set memory requests and limits without any input from you. The answer is that I picked them arbitrarily. Your task is to monitor your application and adjust all of these values accordingly.
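For context, this is roughly how the container fragment might sit inside a full Deployment. Treat it as a sketch: the Deployment name and the app: express label are made up for illustration, and the resource values are the same arbitrary ones as above.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: express              # hypothetical name
spec:
  # replicas is left unset here so that an HPA, if you add one,
  # owns the replica count (without an HPA it defaults to 1)
  selector:
    matchLabels:
      app: express           # hypothetical label
  strategy:
    type: RollingUpdate      # the default strategy, shown for clarity
  template:
    metadata:
      labels:
        app: express
    spec:
      containers:
        - name: express
          image: myapp-node
          resources:
            requests:
              memory: "256Mi"   # arbitrary, tune after monitoring
              cpu: "750m"
            limits:
              memory: "512Mi"   # arbitrary, tune after monitoring
              cpu: "1000m"      # one CPU, as in the legacy setup
```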