I am investigating a robust way to scan my Azure AKS clusters and randomly change the numbers of pods, allocated resources, throttling and if possible limit connections to other resources (E.g. database, queues, cache).
The idea is to have this running against any environment (test, QA, live)
My questions are:
Is there tooling for this already?
If this possible via CRON/ Azure pipelines?
This is part of my stress development work cycle that includes API integration and load testing to help find weakness and feedback ways we can improve our offering and teams reputation
Google "Kubernetes chaos engineering".
Look at Azure Chaos Studio https://azure.microsoft.com/en-us/products/chaos-studio/#overview
Create a chaos experiment that uses a Chaos Mesh fault to kill AKS pods with the Azure portal https://learn.microsoft.com/en-us/azure/chaos-studio/chaos-studio-tutorial-aks-portal