I created a PVC, which dynamically provisions a PersistentVolume (using k3s with local-path), and that volume is used by a deployment. I am provisioning everything with Terraform but ran into an error: terraform apply hangs in an infinite loop while creating the PVC and the pod. The PVC is in this state:
Name:          grafana-pvc
Namespace:     default
StorageClass:  local-path
Status:        Pending
Volume:
Labels:        io.kompose.service=grafana-data
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       grafana-778c7f77c7-w7x9f
Events:
  Type    Reason                Age               From                         Message
  ----    ------                ----              ----                         -------
  Normal  WaitForFirstConsumer  79s               persistentvolume-controller  waiting for first consumer to be created before binding
  Normal  WaitForPodScheduled   7s (x5 over 67s)  persistentvolume-controller  waiting for pod grafana-778c7f77c7-w7x9f to be scheduled
and the pod is in this state:
Name:             grafana-778c7f77c7-w7x9f
Namespace:        default
Priority:         0
Service Account:  default
Node:             <none>
Labels:           io.kompose.service=grafana
                  pod-template-hash=778c7f77c7
Annotations:      <none>
Status:           Pending
IP:
IPs:              <none>
Controlled By:    ReplicaSet/grafana-778c7f77c7
Containers:
  grafana:
    Image:        grafana/grafana:9.2.4
    Port:         3000/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:
      /etc/grafana from grafana-configuration (rw)
      /var/lib/grafana from grafana-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-n7cmt (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  grafana-configuration:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  grafana-configuration
    ReadOnly:   false
  grafana-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  grafana-pvc
    ReadOnly:   false
  kube-api-access-n7cmt:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:        BestEffort
Node-Selectors:   <none>
Tolerations:      node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                  node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age  From               Message
  ----     ------            ---- ----               -------
  Warning  FailedScheduling  90s  default-scheduler  0/1 nodes are available: 1 persistentvolumeclaim "grafana-configuration" not found. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
  Warning  FailedScheduling  89s  default-scheduler  0/1 nodes are available: 1 persistentvolumeclaim "grafana-configuration" not found. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
At this stage nothing works anymore: terraform apply eventually times out, and I am no longer able to recover the state.
My terraform files:
grafana.tf:
resource "kubernetes_persistent_volume_claim" "grafana-configuration" {
  metadata {
    name = "grafana-configuration"
    labels = {
      "io.kompose.service" = "grafana-configuration"
    }
  }
  spec {
    access_modes       = ["ReadWriteOnce"]
    storage_class_name = "local-path"
    resources {
      requests = {
        storage = "1Gi"
      }
    }
    volume_name = "grafana-configuration"
  }
}

resource "kubernetes_persistent_volume" "grafana-configuration" {
  metadata {
    name = "grafana-configuration"
  }
  spec {
    storage_class_name = "local-path"
    access_modes       = ["ReadWriteOnce"]
    capacity = {
      storage = "1Gi"
    }
    node_affinity {
      required {
        node_selector_term {
          match_expressions {
            key      = "node-role.kubernetes.io/master"
            operator = "In"
            values   = ["true"]
          }
        }
      }
    }
    persistent_volume_source {
      local {
        path = "/home/administrator/Metrics.Infrastructure/grafana/"
      }
    }
  }
}

resource "kubernetes_persistent_volume_claim" "grafana-pvc" {
  metadata {
    name = "grafana-pvc"
    labels = {
      "io.kompose.service" = "grafana-data"
    }
  }
  spec {
    access_modes       = ["ReadWriteOnce"]
    storage_class_name = "local-path"
    resources {
      requests = {
        storage = "5Gi"
      }
    }
  }
}

resource "kubernetes_deployment" "grafana" {
  metadata {
    name = "grafana"
    labels = {
      "io.kompose.service" = "grafana"
    }
  }
  spec {
    replicas = 1
    selector {
      match_labels = {
        "io.kompose.service" = "grafana"
      }
    }
    template {
      metadata {
        labels = {
          "io.kompose.service" = "grafana"
        }
      }
      spec {
        volume {
          name = "grafana-configuration"
          persistent_volume_claim {
            claim_name = "grafana-configuration"
          }
        }
        volume {
          name = "grafana-data"
          persistent_volume_claim {
            claim_name = "grafana-pvc"
          }
        }
        container {
          name  = "grafana"
          image = "grafana/grafana:9.2.4"
          port {
            container_port = 3000
          }
          volume_mount {
            name       = "grafana-configuration"
            mount_path = "/etc/grafana"
          }
          volume_mount {
            name       = "grafana-data"
            mount_path = "/var/lib/grafana"
          }
        }
        restart_policy = "Always"
      }
    }
    strategy {
      type = "Recreate"
    }
  }
}

resource "kubernetes_service" "grafana" {
  metadata {
    name = "grafana"
    labels = {
      "io.kompose.service" = "grafana"
    }
  }
  spec {
    port {
      port        = 3000
      target_port = 3000
      node_port   = 30001
    }
    type = "NodePort"
    selector = {
      "io.kompose.service" = "grafana"
    }
  }
}
prometheus.tf:
# We need these resources so that Prometheus can fetch Kubernetes metrics
resource "kubernetes_cluster_role" "prometheus-clusterrole" {
  metadata {
    name = "prometheus-clusterrole"
  }
  rule {
    api_groups = [""]
    resources  = ["nodes", "nodes/proxy", "services", "endpoints", "pods"]
    verbs      = ["get", "list", "watch"]
  }
  rule {
    api_groups = ["extensions"]
    resources  = ["ingresses"]
    verbs      = ["get", "list", "watch"]
  }
  rule {
    non_resource_urls = ["/metrics"]
    verbs             = ["get"]
  }
}

resource "kubernetes_cluster_role_binding" "prometheus_clusterrolebinding" {
  metadata {
    name = "prometheus-clusterrolebinding"
  }
  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "ClusterRole"
    name      = "prometheus-clusterrole"
  }
  subject {
    kind      = "ServiceAccount"
    name      = "default"
    namespace = "default"
  }
}

resource "kubernetes_config_map" "prometheus-config" {
  metadata {
    name = "prometheus-config"
  }
  data = {
    "prometheus.yml" = file("${path.module}/prometheus/prometheus.yml")
  }
}

resource "kubernetes_persistent_volume_claim" "prometheus_data_claim" {
  metadata {
    name = "prometheus-data-claim"
    labels = {
      "io.kompose.service" = "prometheus-data"
    }
  }
  spec {
    access_modes       = ["ReadWriteOnce"]
    storage_class_name = "local-path"
    resources {
      requests = {
        storage = "20Gi"
      }
    }
  }
}

resource "kubernetes_deployment" "prometheus" {
  metadata {
    name = "prometheus"
    labels = {
      "io.kompose.service" = "prometheus"
    }
  }
  spec {
    replicas = 1
    selector {
      match_labels = {
        "io.kompose.service" = "prometheus"
      }
    }
    template {
      metadata {
        labels = {
          "io.kompose.service" = "prometheus"
        }
      }
      spec {
        volume {
          name = "prometheus-data"
          persistent_volume_claim {
            claim_name = "prometheus-data-claim"
          }
        }
        volume {
          name = "prometheus-config"
          config_map {
            name = "prometheus-config"
          }
        }
        container {
          name  = "prometheus"
          image = "prom/prometheus:v2.40.0"
          args = [
            "--config.file=/config/prometheus.yml",
            "--storage.tsdb.path=/prometheus",
            "--web.enable-lifecycle"
          ]
          port {
            container_port = 9090
          }
          volume_mount {
            name       = "prometheus-config"
            mount_path = "/config"
          }
          volume_mount {
            name       = "prometheus-data"
            mount_path = "/prometheus"
          }
        }
        restart_policy = "Always"
      }
    }
    strategy {
      type = "Recreate"
    }
  }
}

resource "kubernetes_service" "prometheus" {
  metadata {
    name = "prometheus"
    labels = {
      "io.kompose.service" = "prometheus"
    }
  }
  spec {
    port {
      port        = 80
      target_port = 9090
      node_port   = 30000
    }
    type = "NodePort"
    selector = {
      "io.kompose.service" = "prometheus"
    }
  }
}
In case you haven't solved your problem: I had the same issue, and the fix was just a config change in the Terraform resource. With a WaitForFirstConsumer storage class such as local-path, the PVC stays Pending until a pod that uses it is scheduled, but by default Terraform waits for the PVC to be bound before creating the next resource, so the apply deadlocks:
resource "kubernetes_persistent_volume_claim_v1" "mypvc" {
  metadata {
    ...
  }
  spec {
    ...
  }
  wait_until_bound = false
}
With that, Terraform will not wait for the PVC to be bound, so it can go on and create the next resource, the one that actually consumes the PVC.
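Applied to one of the claims from the question, the change would look like this (a sketch, keeping the rest of the resource as posted; note that `wait_until_bound` is a top-level argument of the resource, not part of the `spec` block):

```hcl
resource "kubernetes_persistent_volume_claim" "grafana-configuration" {
  metadata {
    name = "grafana-configuration"
    labels = {
      "io.kompose.service" = "grafana-configuration"
    }
  }
  spec {
    access_modes       = ["ReadWriteOnce"]
    storage_class_name = "local-path"
    resources {
      requests = {
        storage = "1Gi"
      }
    }
    volume_name = "grafana-configuration"
  }

  # Do not block the apply until the claim is bound; with a
  # WaitForFirstConsumer storage class, binding only happens
  # after the consuming pod has been scheduled.
  wait_until_bound = false
}
```

The same argument would go on the `grafana-pvc` and `prometheus_data_claim` resources, since every claim with the local-path storage class has the same binding behavior.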