Tags: kubernetes, google-kubernetes-engine, stackdriver, google-cloud-stackdriver

Kubernetes HPA fails to detect a successfully published custom metric from Stackdriver


I'm trying to scale a Kubernetes Deployment using a HorizontalPodAutoscaler that reads a custom metric from Stackdriver.

I have a GKE cluster with the Stackdriver adapter enabled. I can publish the custom metric type to Stackdriver, and this is how it is displayed in Stackdriver's Metrics Explorer:

[Two screenshots: the custom metric as displayed in Stackdriver's Metrics Explorer]

This is how I have defined my HPA:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: custom.googleapis.com|worker_pod_metrics|baz
      targetValue: 400
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1

After example-hpa is created successfully, kubectl get hpa example-hpa always shows TARGETS as <unknown>; the HPA never picks up the value of the custom metric.

NAME          REFERENCE                       TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
example-hpa   Deployment/test-app-group-1-1   <unknown>/400   1         10        1          18m

I publish my custom metrics from a Java client that runs locally. I set the resource labels as described here (hard-coded, so that the client can run without problems in a local environment). I followed this document to create the Java client.

// Imports needed by this method.
import com.google.api.MonitoredResource;
import java.util.HashMap;
import java.util.Map;

// Builds the monitored resource that is attached to each published time series.
private static MonitoredResource prepareMonitoredResourceDescriptor() {
    // Labels are hard-coded so that the client can run in a local environment.
    Map<String, String> resourceLabels = new HashMap<>();
    resourceLabels.put("project_id", "<<<my-project-id>>>");
    resourceLabels.put("pod_id", "<my pod UID>");
    resourceLabels.put("container_name", "");
    resourceLabels.put("zone", "asia-southeast1-b");
    resourceLabels.put("cluster_name", "my-cluster");
    resourceLabels.put("namespace_id", "mynamespace");
    resourceLabels.put("instance_id", "");

    return MonitoredResource.newBuilder()
            .setType("gke_container")
            .putAllLabels(resourceLabels)
            .build();
}

What am I doing wrong in the steps above? Thank you in advance for any answers!


EDIT [RESOLVED]: I think I had some misconfigurations: kubectl describe hpa [NAME] --v=9 showed a 403 status code, and I was also using type: External instead of type: Pods (thanks MWZ for your answer, which pointed out this mistake).

I managed to fix it by creating a new project, a new service account, and a new GKE cluster (basically starting over from scratch). Then I changed my YAML file as follows, exactly as this document explains.

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: test-app-group-1-1
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: test-app-group-1-1
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods                 # Earlier this was type: External
    pods:                      # Earlier this was external:
      metricName: baz                               # metricName: custom.googleapis.com|worker_pod_metrics|baz
      targetAverageValue: 20

I'm now exporting as custom.googleapis.com/baz, and NOT as custom.googleapis.com/worker_pod_metrics/baz. Also, now I'm explicitly specifying the namespace for my HPA in the yaml.
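
Judging from the two manifests above, the name used in the HPA seems to be derived from the Stackdriver metric type by replacing "/" with "|", and, for type: Pods, by dropping the custom.googleapis.com/ prefix first. The following is a minimal sketch of that mapping. HpaMetricNames and its methods are hypothetical helpers written for illustration, not part of any Google or Kubernetes library, and the convention itself is an assumption inferred from this post.

```java
// Hypothetical helper: maps a Stackdriver metric type to the metricName
// used in an HPA manifest. The naming convention is an assumption
// inferred from the manifests in this post.
final class HpaMetricNames {
    private static final String CUSTOM_PREFIX = "custom.googleapis.com/";

    private HpaMetricNames() {}

    // Name for a "type: Pods" metric: drop the custom prefix, then replace '/' with '|'.
    static String forPodsMetric(String stackdriverType) {
        if (!stackdriverType.startsWith(CUSTOM_PREFIX)) {
            throw new IllegalArgumentException("not a custom metric: " + stackdriverType);
        }
        return stackdriverType.substring(CUSTOM_PREFIX.length()).replace('/', '|');
    }

    // Name for a "type: External" metric: keep the full type, replace '/' with '|'.
    static String forExternalMetric(String stackdriverType) {
        return stackdriverType.replace('/', '|');
    }

    public static void main(String[] args) {
        System.out.println(forPodsMetric("custom.googleapis.com/baz"));
        System.out.println(forExternalMetric("custom.googleapis.com/worker_pod_metrics/baz"));
    }
}
```

Under these assumptions, the first call yields the metricName from the working manifest (baz), and the second yields the name from the original External manifest.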


Solution

  • Since you can see your custom metric in the Stackdriver GUI, I'm guessing the metrics are exported correctly. Based on Autoscaling Deployments with Custom Metrics, I believe the metric that the HPA uses to scale the Deployment is defined incorrectly.

    Please try using this YAML:

    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    metadata:
      name: example-hpa
    spec:
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Pods
        pods:
          metricName: baz
          targetAverageValue: 400
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: test-app-group-1-1
    

    Please keep in mind that:

    The HPA uses the metrics to compute an average and compare it to the target average value. In the application-to-Stackdriver export example, a Deployment contains Pods that export metric. The following manifest file describes a HorizontalPodAutoscaler object that scales a Deployment based on the target average value for the metric.

    Troubleshooting steps described on the page above can also be useful.

    Side note: because the HPA above uses the beta API autoscaling/v2beta1, I got an error when running kubectl describe hpa [DEPLOYMENT_NAME]. Running kubectl describe hpa [DEPLOYMENT_NAME] --v=9 instead returned the response as JSON.
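
To make the "compute an average and compare it to the target average value" step in the answer concrete, here is a rough sketch of the replica calculation for a type: Pods metric. It follows the formula given in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentValue / targetValue). It deliberately ignores the controller's tolerance band, pod readiness handling, and stabilization, so treat it as an approximation. HpaSketch and desiredReplicas are hypothetical names used only for illustration.

```java
// Simplified sketch of how an HPA with a "type: Pods" metric might pick a
// replica count. Tolerance, readiness, and stabilization are ignored.
final class HpaSketch {
    private HpaSketch() {}

    static int desiredReplicas(double[] perPodValues, double targetAverageValue,
                               int minReplicas, int maxReplicas) {
        if (perPodValues.length == 0 || targetAverageValue <= 0) {
            throw new IllegalArgumentException("need at least one pod and a positive target");
        }
        double sum = 0;
        for (double value : perPodValues) {
            sum += value;
        }
        // Average of the metric across the current pods.
        double average = sum / perPodValues.length;
        // desired = ceil(currentReplicas * average / target)
        int desired = (int) Math.ceil(perPodValues.length * average / targetAverageValue);
        // Clamp to the bounds configured in the HPA spec.
        return Math.max(minReplicas, Math.min(maxReplicas, desired));
    }

    public static void main(String[] args) {
        // Two pods reporting baz = 30 and 50, with targetAverageValue: 20,
        // minReplicas: 1 and maxReplicas: 5 as in the working manifest.
        System.out.println(desiredReplicas(new double[] {30, 50}, 20, 1, 5));
    }
}
```

With the values in main, the average is 40, which is twice the target, so the sketch asks for 4 replicas. A higher load would be capped at maxReplicas.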