amazon-cloudwatch amazon-eks

AWS EKS Amazon CloudWatch Observability: AmazonCloudWatchAgent CR: resource name may not be empty


I'm trying to set up log collection from my AWS EKS cluster to AWS CloudWatch.

Everything is on the latest available version. I couldn't find any anomalies or other errors.

The EKS add-ons are installed, and the node group role has CloudWatchAgentServerPolicy attached.

It does work and I can see logs, but service/amazon-cloudwatch-observability-webhook-service spams CloudWatch with this error:

{
  "level": "error",
  "ts": "2024-05-24T05:01:42Z",
  "msg": "Reconciler error",
  "controller": "dcgmexporter",
  "controllerGroup": "cloudwatch.aws.amazon.com",
  "controllerKind": "DcgmExporter",
  "DcgmExporter": {
    "name": "dcgm-exporter",
    "namespace": "amazon-cloudwatch"
  },
  "namespace": "amazon-cloudwatch",
  "name": "dcgm-exporter",
  "reconcileID": "d97d773b-f857-4f98-a20b-a3648e8c53a4",
  "error": "failed to apply status changes to the AmazonCloudWatchAgent CR: resource name may not be empty",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"
}

Or this:

{
  "level": "error",
  "ts": "2024-05-24T05:46:35Z",
  "msg": "Reconciler error",
  "controller": "neuronmonitor",
  "controllerGroup": "cloudwatch.aws.amazon.com",
  "controllerKind": "NeuronMonitor",
  "NeuronMonitor": {
    "name": "neuron-monitor",
    "namespace": "amazon-cloudwatch"
  },
  "namespace": "amazon-cloudwatch",
  "name": "neuron-monitor",
  "reconcileID": "8ad31ba9-a6a5-4206-adc5-c24ece9086f9",
  "error": "failed to apply status changes to the AmazonCloudWatchAgent CR: resource name may not be empty",
  "stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"
}

Any advice where to start digging would be appreciated.

Update 1:

The error logs are actually coming from the pod amazon-cloudwatch-observability-controller-manager-xxxxxxxx.
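
For anyone else trying to work out which controllers are producing this noise, here is a minimal Python sketch that counts "Reconciler error" entries per controller kind and error message. It assumes the controller-manager pod writes one JSON object per line (the entries above are pretty-printed for readability), and the function name `summarize_reconciler_errors` is just a name chosen for this example, not part of any AWS tooling.

```python
import json
from collections import Counter


def summarize_reconciler_errors(lines):
    """Count "Reconciler error" log entries per (controllerKind, error) pair.

    `lines` is any iterable of strings, assumed to hold one JSON object each.
    Lines that are empty, not valid JSON, or not JSON objects are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # plain-text log line, ignore it
        if not isinstance(entry, dict):
            continue
        if entry.get("level") == "error" and entry.get("msg") == "Reconciler error":
            key = (entry.get("controllerKind", "unknown"), entry.get("error", ""))
            counts[key] += 1
    return counts
```

The lines could come from a file of saved pod logs or from a CloudWatch export; iterating over `summarize_reconciler_errors(lines).most_common()` then lists the noisiest controllers first. With the two entries shown above, this would report one error each for DcgmExporter and NeuronMonitor, both with the same "resource name may not be empty" message.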


Solution

  • According to this pull request, this has been fixed in the latest version.
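
To check whether an installed add-on is older than the release that contains the fix, the version strings can be compared numerically. This is only a sketch: it assumes the add-on version follows the usual EKS add-on format `vMAJOR.MINOR.PATCH-eksbuild.N`, and the versions in the comments are placeholders, since the answer does not say which release contains the fix. The helper names are made up for this example.

```python
import re

# Assumed format of an EKS add-on version, e.g. "v1.2.3-eksbuild.4".
_VERSION_RE = re.compile(r"v?(\d+)\.(\d+)\.(\d+)(?:-eksbuild\.(\d+))?")


def parse_addon_version(version):
    """Turn 'v1.2.3-eksbuild.4' into the tuple (1, 2, 3, 4).

    A missing '-eksbuild.N' suffix is treated as build 0.
    """
    match = _VERSION_RE.fullmatch(version.strip())
    if match is None:
        raise ValueError(f"unrecognized add-on version: {version!r}")
    major, minor, patch, build = match.groups()
    return (int(major), int(minor), int(patch), int(build or 0))


def is_older(installed, fixed):
    """Return True if `installed` sorts strictly before `fixed`."""
    return parse_addon_version(installed) < parse_addon_version(fixed)


# Placeholder versions, not the real fix release:
# is_older("v1.5.0-eksbuild.1", "v1.6.0-eksbuild.1") is True
```

Comparing tuples of integers avoids the string-comparison trap where "v1.10.0" sorts before "v1.9.0". The installed version is shown on the add-on's page in the EKS console, or can be read with `aws eks describe-addon`.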