Tags: kubernetes, garbage-collection, kubectl

How to recursively get dependent resources of Kubernetes owner resource


With Kubernetes you can use the Garbage Collector to automate the deletion of dependent resources when owning resources are removed. I'm wondering what the easiest way is to print out the dependency tree of an owning resource, potentially limited to a given tree depth if need be.

I understand that fanning out to every resource in a cluster could overload the API server, which is probably why this isn't easy to achieve. Still, I've been struggling to find usable, community-supported workarounds, or even discussions/issues on this topic (likely my poor searching skills), so any help in achieving this would be great!


To make things more concrete, a specific example of the (hypothetical) kubectl get query I'd like to run would be something like kubectl get scheduledworkflow <workflow name> --dependents:

  • This would find the Kubeflow Pipelines ScheduledWorkflow resource, then recurse.
  • That would find all Argo Workflow resources owned by it.
  • Then, for each Workflow resource, it would find many Pod and Volume resources (there are a few other types, but I wanted to paint the picture of these being disparate resource types).

We typically keep only a small number of Argo Workflow resources in the cluster at any one time, as the majority of our Workflows spawn 1k+ Pods, so we have pretty aggressive GC policies in place. Even so, listing these is painful at the moment and we need a custom script to do it. I'm wondering whether there is a higher-level CLI, SDK or API available (or any group in the community working on this issue!).


Solution

  • There is no ready-made solution for this.

    I see two ways to approach it:


    1 - Probably what you already mentioned: "need to use a custom script to do it".

    The idea is to fetch the required resource types as JSON and then process them in any language you know (bash, Python, Java, etc.) and/or with jq. Every dependent object has a metadata.ownerReferences field, which lets you match it to its owners.

    More information about owners and dependents

    jq tool and examples
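
    As a rough illustration of option 1, here is a minimal Python sketch that rebuilds the tree from JSON already fetched from the cluster. It assumes the objects look like the items of a kubectl get ... -o json listing (each with kind, metadata.name, metadata.uid and, for dependents, metadata.ownerReferences). The function names and the sample objects are made up for this example.

```python
from collections import defaultdict


def load_items(doc):
    """Return the objects in a parsed `kubectl get ... -o json` document."""
    return doc["items"] if doc.get("kind") == "List" else [doc]


def build_dependents_index(objects):
    """Map each owner UID to the objects that list it in ownerReferences."""
    index = defaultdict(list)
    for obj in objects:
        for ref in obj["metadata"].get("ownerReferences", []):
            index[ref["uid"]].append(obj)
    return index


def dependency_tree(root, index, max_depth=None, _depth=0, _seen=None):
    """Return (depth, kind, name) rows for root and its dependents, depth-first."""
    _seen = set() if _seen is None else _seen
    uid = root["metadata"]["uid"]
    if uid in _seen:  # guard against cycles and multiply-owned objects
        return []
    _seen.add(uid)
    rows = [(_depth, root["kind"], root["metadata"]["name"])]
    if max_depth is not None and _depth >= max_depth:
        return rows
    for child in index.get(uid, []):
        rows.extend(dependency_tree(child, index, max_depth, _depth + 1, _seen))
    return rows


# Hypothetical sample data standing in for real cluster output.
SAMPLE = {
    "kind": "List",
    "items": [
        {"kind": "ScheduledWorkflow",
         "metadata": {"name": "nightly", "uid": "u1"}},
        {"kind": "Workflow",
         "metadata": {"name": "nightly-1", "uid": "u2",
                      "ownerReferences": [{"kind": "ScheduledWorkflow",
                                           "name": "nightly", "uid": "u1"}]}},
        {"kind": "Pod",
         "metadata": {"name": "nightly-1-step-a", "uid": "u3",
                      "ownerReferences": [{"kind": "Workflow",
                                           "name": "nightly-1", "uid": "u2"}]}},
        {"kind": "PersistentVolumeClaim",
         "metadata": {"name": "nightly-1-data", "uid": "u4",
                      "ownerReferences": [{"kind": "Workflow",
                                           "name": "nightly-1", "uid": "u2"}]}},
    ],
}

if __name__ == "__main__":
    items = load_items(SAMPLE)
    index = build_dependents_index(items)
    for depth, kind, name in dependency_tree(items[0], index, max_depth=2):
        print("  " * depth + f"{kind}/{name}")
```

    Against a real cluster, the input could come from something like kubectl get scheduledworkflows,workflows,pods -n <namespace> -o json parsed with json.load; the plural resource names there are assumptions about how the CRDs are registered. Matching on uid rather than name avoids collisions between same-named objects of different kinds.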


    2 - Write your own tool based on the Kubernetes garbage collector.

    The Kubernetes garbage collector works from a graph built by GraphBuilder:

    garbage collector source code

    The graph is kept up to date by reflectors:

    GarbageCollector runs reflectors to watch for changes of managed API objects, funnels the results to a single-threaded dependencyGraphBuilder, which builds a graph caching the dependencies among objects

    See the graph_builder source code for the full logic.

    Each vertex of the built graph is of the node type:

    graph data structure
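
    To give a feel for what such a graph looks like, below is a simplified Python analogue, not the garbage collector's actual Go implementation. The class and field names (Node, DependencyGraph, virtual, observe, remove) are chosen for this sketch. It assumes each node records its owners and its dependents, and that an owner referenced before it has been seen gets a placeholder node until the real object arrives.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    uid: str
    kind: str = ""
    name: str = ""
    owners: list = field(default_factory=list)    # UIDs of owning objects
    dependents: set = field(default_factory=set)  # UIDs of dependent objects
    virtual: bool = True  # True until the object itself has been observed


class DependencyGraph:
    """In-memory owner/dependent graph keyed by object UID."""

    def __init__(self):
        self.nodes = {}

    def _get(self, uid):
        # Create a placeholder node if this UID has not been seen yet.
        if uid not in self.nodes:
            self.nodes[uid] = Node(uid=uid)
        return self.nodes[uid]

    def observe(self, obj):
        """Record an added or updated object and rewire its owner edges."""
        meta = obj["metadata"]
        node = self._get(meta["uid"])
        # Drop edges from previous owners in case ownerReferences changed.
        for owner_uid in node.owners:
            if owner_uid in self.nodes:
                self.nodes[owner_uid].dependents.discard(node.uid)
        node.kind = obj.get("kind", "")
        node.name = meta["name"]
        node.virtual = False
        node.owners = [ref["uid"] for ref in meta.get("ownerReferences", [])]
        for owner_uid in node.owners:
            self._get(owner_uid).dependents.add(node.uid)

    def remove(self, uid):
        """Record a deleted object and detach it from its owners."""
        node = self.nodes.pop(uid, None)
        if node is None:
            return
        for owner_uid in node.owners:
            if owner_uid in self.nodes:
                self.nodes[owner_uid].dependents.discard(uid)
```

    A tool built this way would call observe and remove from watch events, so that listing the dependents of any object becomes a lookup in memory instead of a fresh fan-out against the API server.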

    It's also worth mentioning that working with the API server is more convenient with the Kubernetes client libraries, which are available for several languages.