
Return persistent volume (PV) capacity as an integer instead of Gi, Mi, Ki, G, M, k, etc.


I would like to calculate the total number of bytes allocated by the persistent volumes (PVs) in a cluster. Using the following:

$ kubectl get pv -A -o json

This returns a JSON list of all the cluster's PVs; for each PV in the items[] list, the spec.capacity.storage key holds the capacity. See the example below:

{
  "apiVersion": "v1",
  "kind": "PersistentVolume",
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "capacity": {
      "storage": "500Gi"
    },
    "claimRef": {
      "apiVersion": "v1",
      "kind": "PersistentVolumeClaim",
      "name": "s3-storage-minio",
      "namespace": "default",
      "resourceVersion": "515932"
    },
    "persistentVolumeReclaimPolicy": "Delete",
    "volumeMode": "Filesystem"
  },
  "status": {
    "phase": "Bound"
  }
}

However, the returned values can carry different suffixes: storage is expressed as a plain integer or as a fixed-point number using one of the suffixes E, P, T, G, M, k, or their power-of-two equivalents Ei, Pi, Ti, Gi, Mi, Ki.

Is there a neat way to request the capacity as an integer (or in any other format that is consistent across all PVs) using kubectl?

Otherwise, converting the different suffixes to a common one in Bash does not look very straightforward.

Thanks in advance for your help.


Solution

  • I haven't found a way to transform a value in .spec.capacity.storage using purely kubectl.


    I managed to write a script using Python and its Kubernetes client library that extracts the data and calculates the total size of all PVs. Please treat this code as an example, not as production-ready:

    from kubernetes import client, config
    import re

    config.load_kube_config()  # use ~/.kube/config
    v1 = client.CoreV1Api()

    # Bytes per unit for each decimal and binary quantity suffix.
    multiplier_dict = {
        "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4, "P": 1000**5, "E": 1000**6,
        "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4, "Pi": 1024**5, "Ei": 1024**6,
    }
    size = 0

    # for i in v1.list_persistent_volume_claim_for_all_namespaces(watch=False).items: # PVC

    for i in v1.list_persistent_volume(watch=False).items: # PV

        x = i.spec.capacity["storage"] # PV
        # x = i.spec.resources.requests["storage"] # PVC
        # Split into number and suffix, e.g. "500Mi" -> ['500', 'Mi'] (integer values only)
        y = re.findall(r'[A-Za-z]+|\d+', x)
        print(y)

        if len(y) == 1:  # plain integer, no suffix (like Mi)
            size += int(y[0])
        elif y[1] in multiplier_dict:
            size += multiplier_dict[y[1]] * int(y[0])
        else:
            # fail loudly instead of silently skipping the volume
            raise ValueError("Unsupported quantity: " + x)

    print("The size in bytes of all PV's is: " + str(size))
    

    Take as an example a cluster that has the following PVs:

    • $ kubectl get pv
    NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
    pvc-6b5236ec-547f-4f96-8448-e3dbe01c9039   500Mi      RWO            Delete           Bound    default/pvc-four    hostpath                4m13s
    pvc-86d178bc-1673-44e0-9a89-2efb14a1d22c   512M       RWO            Delete           Bound    default/pvc-three   hostpath                4m15s
    pvc-89b64f93-6bf4-4987-bdda-0356d19d6f59   1G         RWO            Delete           Bound    default/pvc-one     hostpath                4m15s
    pvc-a3455e77-0db0-4cab-99c9-c72721a65632   10Ki       RWO            Delete           Bound    default/pvc-six     hostpath                4m14s
    pvc-b47f92ef-f627-4391-943f-efa4241d0811   10k        RWO            Delete           Bound    default/pvc-five    hostpath                4m13s
    pvc-c3e13d78-9047-4899-99e7-0b2667ce4698   1Gi        RWO            Delete           Bound    default/pvc-two     hostpath                4m15s
    pvc-c57fe2b0-013a-412b-bca9-05050990766a   10         RWO            Delete           Bound    default/pvc-seven   hostpath                113s
    

    The code produces the following output:

    ['500', 'Mi']
    ['512', 'M']
    ['1', 'G']
    ['10', 'Ki']
    ['10', 'k']
    ['1', 'Gi']
    ['10']
    The size in bytes of all PV's is: 3110050074
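
    The regular expression in the script only handles integer values such as 500Mi; a fractional value such as 1.5Gi is not parsed correctly. If the Kubernetes client library is not available, the same sum can also be computed from the JSON printed by kubectl get pv -o json. The following is a minimal sketch under those assumptions, using only the standard library. The names quantity_to_bytes and total_pv_bytes are hypothetical helpers, not part of any Kubernetes API. Exponent notation (for example 1e3) and the milli suffix m are deliberately rejected, and any fractional byte is truncated.

```python
import json
import re
from decimal import Decimal

# Multipliers for the decimal and binary suffixes listed in the question.
DECIMAL_SUFFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18}
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
MULTIPLIERS = {"": 1, **DECIMAL_SUFFIXES, **BINARY_SUFFIXES}

# A number (optionally fractional) followed by an optional alphabetic suffix.
QUANTITY_RE = re.compile(r"^(\d+(?:\.\d+)?|\.\d+)([A-Za-z]*)$")


def quantity_to_bytes(quantity: str) -> int:
    """Convert a quantity such as '500Mi', '1.5Gi' or '10' to whole bytes."""
    match = QUANTITY_RE.match(quantity.strip())
    if match is None or match.group(2) not in MULTIPLIERS:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    number, suffix = match.groups()
    # Decimal avoids float rounding; int() truncates any fractional byte.
    return int(Decimal(number) * MULTIPLIERS[suffix])


def total_pv_bytes(pv_list_json: str) -> int:
    """Sum spec.capacity.storage over the items of a PV list in JSON form."""
    items = json.loads(pv_list_json)["items"]
    return sum(quantity_to_bytes(pv["spec"]["capacity"]["storage"]) for pv in items)
```

    One possible way to use it is to save the kubectl output to a file, or pipe it to a small script, and pass the text to total_pv_bytes. For the seven example PVs listed above it returns 3110050074, the same total as the script.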
    

    As a final note, remember that the size requested by a PVC and the actual size of the PV can differ. Please refer to the documentation of your storage solution in that regard.

    • pvc.yaml:
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: pvc
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 100M
    

    Part of the $ kubectl get pvc -o yaml output:

      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100M # <-- REQUEST
        <-- REDACTED --> 
      status:
        accessModes:
        - ReadWriteOnce
        capacity:
          storage: 1Gi # <-- SIZE OF PV
        phase: Bound
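
    To make the gap in this example concrete, the two values can be converted to bytes and compared. This is only a sketch: to_bytes is a hypothetical helper that knows just the two suffixes appearing here (M and Gi) and accepts integer values only.

```python
# Byte multipliers for the two suffixes used in this example only.
UNITS = {"M": 1000**2, "Gi": 1024**3}


def to_bytes(quantity: str) -> int:
    """Convert an integer quantity with an 'M' or 'Gi' suffix to bytes."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain integer, no suffix


requested = to_bytes("100M")   # spec.resources.requests.storage
provisioned = to_bytes("1Gi")  # status.capacity.storage

print(requested)                # 100000000
print(provisioned)              # 1073741824
print(provisioned - requested)  # 973741824
```

    Here the PV is more than ten times larger than the request, so summing PVC requests and summing PV capacities can give very different totals.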
    

    Additional resources: