kubernetes terraform minio coreos-ignition fedora-coreos

Self-hosted reprovision of S3 cluster without downtime


My Goal

I am currently experimenting with K8s in my homelab. I am working towards creating some IaC with Terraform to provision a K8s APPS cluster made of Fedora CoreOS nodes.

I thought a good plan to make K8s node provisioning fully automated would be to:

  1. Let Terraform generate a CoreOS Ignition file for the OS provisioning config.
  2. Use Terraform to embed my Ignition file in my custom-built ISO of Fedora CoreOS (built using CoreOS Assembler).
  3. Upload the ISO to local object storage running MinIO (probably using Terraform).
  4. Provision every node with Terraform, booting from the ISO URL.
  5. Let the OS installation happen with Ignition.
  6. Do any further setup after provisioning with Ansible.
  7. APPS cluster provisioning completed.

The APPS cluster would be used to run all my containers, such as Jellyfin & co.

How can I avoid creating redundant S3 clusters?

My understanding is that for an optimal solution, MinIO (S3) should not be on the APPS cluster but on a different K8s cluster, so that if I need to provision the APPS cluster again I can destroy all the nodes without worries.

If both clusters run Fedora CoreOS, how can I avoid creating a second S3 cluster just to store the ISO for when I need to reprovision my first S3 cluster?

Extra notes

  • Please feel free to highlight anything that may sound wrong to you.

Solution

  • Your goal sounds good. A little bit like bootstrapping OpenShift 4 nowadays. Now, where to run MinIO is a good question.

    Not knowing how you use S3 beyond booting those VMs, nor which hypervisor or Terraform providers you use, I'll keep it generic:

    1/ First question would be: do you need MinIO?

    Bootstrapping your Kubernetes cluster from MinIO, you're hitting a chicken-and-egg problem. To pull your initial images & Ignition files, the bare minimum would be some HTTP server.
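    To illustrate why a plain HTTP server is enough: an Ignition config can be a tiny "pointer" that merges the real config from a URL. Below is a minimal sketch that generates such a pointer. The spec version, the URL and the function name `pointer_ignition` are assumptions for illustration, not something taken from your setup.

```python
import json


def pointer_ignition(config_url, spec_version="3.3.0"):
    """Return a minimal Ignition config that merges the real config from config_url.

    The spec version is an assumption; use whichever Ignition spec your
    Fedora CoreOS release supports.
    """
    return {
        "ignition": {
            "version": spec_version,
            "config": {"merge": [{"source": config_url}]},
        }
    }


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address, a placeholder for your HTTP host.
    print(json.dumps(pointer_ignition("http://192.0.2.10:8080/worker.ign"), indent=2))
```

    The pointer rarely changes, so it can stay baked into the ISO while the full config is edited on the HTTP host.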

    1.1/ MinIO used by other non-kubernetes services

    Assuming MinIO is somehow required for other bare-metal-related stuff: given it's required to bootstrap Kubernetes, I would consider hosting MinIO as a bare-metal application (or using good old virtualization), outside of Kubernetes.

    Going there, I might even consider MinIO alternatives, like Ceph: offering both object and block storage, it could be useful for setting up dynamically provisioned PVCs. (Warning: the minimal Ceph cluster is usually larger than the bare-minimum MinIO config. But if you're about to drop one of your k8s clusters, maybe that makes sense, ...)

    1.2/ Try not to rely on MinIO for cluster bootstrap

    Unless there's a reason to have MinIO involved in bootstrapping VMs, I would just serve my ISO & Ignition files with some nginx, lighttpd or apache.

    Easier to maintain, re-create, ... And you might be able to configure that HTTP server on whichever host generates your custom ISOs.
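    As a rough sketch of how little is needed, here is a static file server using only the Python standard library, standing in for nginx/lighttpd/apache. The `./artifacts` folder, the port and the function name `make_artifact_server` are hypothetical; for anything beyond a homelab test, a real web server is likely the better choice.

```python
import functools
import http.server


def make_artifact_server(directory, host="127.0.0.1", port=8080):
    """Build (but do not start) an HTTP server exposing the files in `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # port=0 lets the OS pick a free port; read it from server.server_address.
    return http.server.ThreadingHTTPServer((host, port), handler)


if __name__ == "__main__":
    # "./artifacts" is a hypothetical folder holding the ISO and .ign files.
    server = make_artifact_server("./artifacts", host="0.0.0.0")
    print("serving on port", server.server_address[1])
    server.serve_forever()
```

    Running this on the host that builds the ISOs means the build output directory can be served directly, with no upload step.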

    1.3/ MinIO required by kubernetes-hosted applications

    Assuming you mostly use MinIO with your applications: keep it in Kubernetes.

    You don't even need a dedicated Kubernetes cluster: you may use nodeSelectors, taints, tolerations, ... so that your MinIO Pods run on somewhat-dedicated Kubernetes workers, while your applications run on "regular" nodes.
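    A minimal sketch of that scheduling setup, written as a Python function that emits a Deployment manifest as JSON (which `kubectl apply -f` accepts). The node label `node-role.example/storage`, the taint `dedicated=storage:NoSchedule`, the `storage` namespace and the function name are all assumptions; a real MinIO install would more likely be a StatefulSet with persistent volumes or the MinIO operator.

```python
import json


def minio_deployment(node_label_key="node-role.example/storage",
                     taint_key="dedicated", taint_value="storage"):
    """Return a Deployment manifest pinned to labelled, tainted storage workers."""
    labels = {"app": "minio"}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "minio", "namespace": "storage"},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Only schedule onto nodes carrying this (hypothetical) label.
                    "nodeSelector": {node_label_key: "true"},
                    # Tolerate the taint that keeps regular workloads away.
                    "tolerations": [{
                        "key": taint_key,
                        "operator": "Equal",
                        "value": taint_value,
                        "effect": "NoSchedule",
                    }],
                    "containers": [{
                        "name": "minio",
                        "image": "quay.io/minio/minio",
                        # No volume is mounted here, so /data is ephemeral.
                        "args": ["server", "/data"],
                    }],
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(minio_deployment(), indent=2))
```

    The storage workers would need the matching label and taint applied beforehand; the nodeSelector attracts MinIO to them, and the taint repels everything that lacks the toleration.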

    You can use separate namespaces, set up cluster RBAC, maybe even NetworkPolicies (assuming your SDN supports them) ... in a way that isolates your storage from your workloads.
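    For the NetworkPolicy part, a sketch along these lines could restrict ingress to MinIO so that only Pods from the application namespace reach it. The namespace names `storage` and `apps`, the `app: minio` Pod label and port 9000 are assumptions; the `kubernetes.io/metadata.name` label is set automatically on namespaces by recent Kubernetes versions.

```python
import json


def minio_network_policy(storage_ns="storage", apps_ns="apps", port=9000):
    """Return a NetworkPolicy allowing ingress to MinIO Pods only from apps_ns."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "minio-allow-apps", "namespace": storage_ns},
        "spec": {
            # Applies to Pods labelled app=minio (a hypothetical label).
            "podSelector": {"matchLabels": {"app": "minio"}},
            "policyTypes": ["Ingress"],
            "ingress": [{
                "from": [{
                    "namespaceSelector": {
                        "matchLabels": {"kubernetes.io/metadata.name": apps_ns}
                    }
                }],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(minio_network_policy(), indent=2))
```

    Once a policy selects the MinIO Pods for ingress, any traffic not matched by one of its rules is dropped, provided the network plugin enforces NetworkPolicies.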

    2/ Next question: do you HAVE TO re-create nodes (clusters?), redeploying your application?

    Beyond testing your code, there isn't much value in re-creating everything: with Kubernetes, re-deploying an app from scratch should not require more than resetting your PVCs, and maybe re-creating all objects or deleting and re-creating the namespace hosting your applications.
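    As a sketch of such a reset, the helper below only builds the `kubectl` command lines for deleting and re-creating a namespace, then re-applying manifests. The namespace name, the manifest directory and the function name are hypothetical. Note that deleting a namespace also deletes the PVCs in it, and whether the underlying volumes survive depends on their reclaim policy.

```python
import shlex


def reset_namespace_commands(namespace, manifest_dir):
    """Return the kubectl invocations for a from-scratch redeploy of one namespace."""
    return [
        # Removes every namespaced object, PVCs included.
        ["kubectl", "delete", "namespace", namespace, "--ignore-not-found"],
        ["kubectl", "create", "namespace", namespace],
        ["kubectl", "apply", "--namespace", namespace, "-f", manifest_dir],
    ]


if __name__ == "__main__":
    # "apps" and "./manifests" are placeholders for illustration.
    for cmd in reset_namespace_commands("apps", "./manifests"):
        print(shlex.join(cmd))
```

    Printing the commands rather than running them keeps the sketch safe to try; they could be passed to `subprocess.run` once reviewed.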

    Deploying and re-deploying workloads in Kubernetes "should" not require destroying and re-creating your k8s cluster itself, nor any of its nodes.

    Still: you shouldn't be afraid to re-create nodes. Workers are disposable: destroying them and creating new ones that re-join your cluster makes perfect sense. Running on cloud (aws/azure/openstack/gce/...), we would usually set up some autoscaler: it will destroy instances and pop new ones, according to overall cluster resource usage. Scaling your clusters in and out is perfectly normal.
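    Before destroying a worker, it is usually worth draining it so its Pods get rescheduled elsewhere. A minimal sketch of the command sequence, again only building the `kubectl` invocations; the node name and the function name are hypothetical, and Terraform would handle destroying and re-creating the machine itself.

```python
import shlex


def retire_worker_commands(node):
    """Return the kubectl invocations to evict workloads from a node and remove it."""
    return [
        # drain cordons the node first, then evicts its Pods.
        ["kubectl", "drain", node, "--ignore-daemonsets", "--delete-emptydir-data"],
        # Remove the Node object so a freshly provisioned machine can join cleanly.
        ["kubectl", "delete", "node", node],
    ]


if __name__ == "__main__":
    for cmd in retire_worker_commands("worker-1"):
        print(shlex.join(cmd))
```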