Tags: kubernetes, persistent-volumes

Where can I locate the actual files of a Kubernetes PV hostPath?


I just created the following PersistentVolume:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: sql-pv
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/var/lib/sqldata"

Then I SSHed into the node and navigated to /var/lib, but I cannot see the sqldata directory created anywhere in it.

Where is the real directory created?

I created a pod that mounts this volume at a path inside the container. When I open a shell in the container, I can see the files in the mount path. Where are these files stored?


Solution

  • You have set up your cluster on Google Kubernetes Engine, which means the nodes are virtual machine instances on GCP. You've probably been connecting to the cluster using the Kubernetes Engine dashboard and the Connect to the cluster option. That does not SSH you into any of the nodes; it just starts a GCP Cloud Shell terminal instance with a command like:

    gcloud container clusters get-credentials {your-cluster} --zone {your-zone} --project {your-project-name}
    

    That command configures the kubectl agent on GCP Cloud Shell by setting the proper cluster name, certificates, etc. in the ~/.kube/config file, so you have access to the cluster (by communicating with the cluster endpoint), but you are not SSHed into any node. That's why you can't access the path defined in hostPath.
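You can see this distinction from Cloud Shell itself; for example (the cluster, zone, and project names below are placeholders for your own):

```shell
# Configure kubectl for the cluster (writes credentials to ~/.kube/config)
gcloud container clusters get-credentials my-cluster --zone us-central1-a --project my-project

# Show which cluster context kubectl now points at - this talks to the
# cluster's API endpoint over the network; it does not log you into a node
kubectl config current-context
```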

    To find a hostPath directory, you need to:

    • find out which node the pod is running on
    • SSH into that node

    Finding a node:

    Run the following kubectl get pod {pod-name} -o wide command - change {pod-name} to your pod's name:

    user@cloudshell:~ (project)$ kubectl get pod task-pv-pod -o wide
    NAME          READY   STATUS    RESTARTS   AGE   IP            NODE                                         NOMINATED NODE   READINESS GATES
    task-pv-pod   1/1     Running   0          53m   xx.xx.x.xxx   gke-test-v-1-21-default-pool-82dbc10b-8mvx   <none>           <none>
    
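Alternatively, if you only need the node name, jsonpath output extracts it directly (using the pod name task-pv-pod from above):

```shell
# Print just the name of the node the pod is scheduled on
kubectl get pod task-pv-pod -o jsonpath='{.spec.nodeName}'
```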

    SSH to the node:

    Run the following gcloud compute ssh {node-name} command - change {node-name} to the node name from the previous command:

    user@cloudshell:~ (project)$ gcloud compute ssh gke-test-v-1-21-default-pool-82dbc10b-8mvx
    
    Welcome to Kubernetes v1.21.3-gke.2001!
    
    You can find documentation for Kubernetes at:
      http://docs.kubernetes.io/
    
    The source for this release can be found at:
      /home/kubernetes/kubernetes-src.tar.gz
    Or you can download it at:
      https://storage.googleapis.com/kubernetes-release-gke/release/v1.21.3-gke.2001/kubernetes-src.tar.gz
    
    It is based on the Kubernetes source at:
      https://github.com/kubernetes/kubernetes/tree/v1.21.3-gke.2001
    
    For Kubernetes copyright and licensing information, see:
      /home/kubernetes/LICENSES
    
    user@gke-test-v-1-21-default-pool-82dbc10b-8mvx ~ $ 
    

    Now there will be a hostPath directory (in your case /var/lib/sqldata); it will also contain files if the pod created any.
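If the directory is missing on the node, note that hostPath has an optional type field that controls whether Kubernetes creates it. A sketch of the relevant part of the PV spec (the rest of the manifest is unchanged from the question):

```yaml
  hostPath:
    path: "/var/lib/sqldata"
    # DirectoryOrCreate: if nothing exists at this path, an empty
    # directory is created there (with 0755 permissions)
    type: DirectoryOrCreate
```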


    Avoid hostPath if possible

    Using hostPath is not recommended. As mentioned in the comments, it will cause issues when a pod is created on a different node (though you have a single-node cluster), and it also presents many security risks:

    Warning: HostPath volumes present many security risks, and it is a best practice to avoid the use of HostPaths when possible. When a HostPath volume must be used, it should be scoped to only the required file or directory, and mounted as ReadOnly. If restricting HostPath access to specific directories through AdmissionPolicy, volumeMounts MUST be required to use readOnly mounts for the policy to be effective.

    In your case it's much better to use the gcePersistentDisk volume type - check this article.
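On GKE, the simplest way to get a gcePersistentDisk-backed volume is to let the cluster's default StorageClass provision one dynamically through a PersistentVolumeClaim. A minimal sketch (the claim name sql-pvc is illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sql-pvc
spec:
  # GKE's built-in "standard" StorageClass dynamically provisions
  # a GCE persistent disk when this claim is bound
  storageClassName: standard
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

A pod then references the claim in its volumes section instead of pointing at a node-local path, so the data survives rescheduling to another node.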