I'm trying to set up log collection from my AWS EKS cluster to AWS CloudWatch.
Everything is on the latest available version, and I couldn't find any anomalies or other errors.
The EKS add-ons are installed, and the node group role has CloudWatchAgentServerPolicy attached.
It does work and I can see logs, but
service/amazon-cloudwatch-observability-webhook-service spams CloudWatch with this error:
{
"level": "error",
"ts": "2024-05-24T05:01:42Z",
"msg": "Reconciler error",
"controller": "dcgmexporter",
"controllerGroup": "cloudwatch.aws.amazon.com",
"controllerKind": "DcgmExporter",
"DcgmExporter": {
"name": "dcgm-exporter",
"namespace": "amazon-cloudwatch"
},
"namespace": "amazon-cloudwatch",
"name": "dcgm-exporter",
"reconcileID": "d97d773b-f857-4f98-a20b-a3648e8c53a4",
"error": "failed to apply status changes to the AmazonCloudWatchAgent CR: resource name may not be empty",
"stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"
}
Or this:
{
"level": "error",
"ts": "2024-05-24T05:46:35Z",
"msg": "Reconciler error",
"controller": "neuronmonitor",
"controllerGroup": "cloudwatch.aws.amazon.com",
"controllerKind": "NeuronMonitor",
"NeuronMonitor": {
"name": "neuron-monitor",
"namespace": "amazon-cloudwatch"
},
"namespace": "amazon-cloudwatch",
"name": "neuron-monitor",
"reconcileID": "8ad31ba9-a6a5-4206-adc5-c24ece9086f9",
"error": "failed to apply status changes to the AmazonCloudWatchAgent CR: resource name may not be empty",
"stacktrace": "sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\\n\\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\\n\\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\\n\\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"
}
Any advice on where to start digging would be appreciated.
Update 1:
The error logs are actually coming from the pod amazon-cloudwatch-observability-controller-manager-xxxxxxxx, not from the webhook service.
According to this pull request, this has been fixed in the latest version.
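For anyone who wants to check which controllers produce the error and how often, here is a minimal sketch in Python. It assumes the controller-manager logs have been exported with one JSON object per line, using the same fields as the entries above (`level`, `msg`, `controller`, `error`). The function name `summarize_reconciler_errors` and the sample lines are hypothetical and only for illustration.

```python
import json
from collections import Counter


def summarize_reconciler_errors(lines):
    """Count "Reconciler error" entries per (controller, error) pair.

    Assumes one JSON object per line; lines that are not JSON objects
    are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a structured log line
        if not isinstance(entry, dict):
            continue
        if entry.get("level") == "error" and entry.get("msg") == "Reconciler error":
            counts[(entry.get("controller"), entry.get("error"))] += 1
    return counts


# Hypothetical sample input, shortened from the log entries in the question.
sample_lines = [
    '{"level": "error", "msg": "Reconciler error", "controller": "dcgmexporter", '
    '"error": "resource name may not be empty"}',
    '{"level": "error", "msg": "Reconciler error", "controller": "neuronmonitor", '
    '"error": "resource name may not be empty"}',
    '{"level": "info", "msg": "Starting workers", "controller": "dcgmexporter"}',
    "plain text line that is not JSON",
]

for (controller, error), n in summarize_reconciler_errors(sample_lines).most_common():
    print(f"{n:4d}  {controller}: {error}")
```

In practice the input could be the saved output of `kubectl logs -n amazon-cloudwatch <pod-name>` for the controller-manager pod, read line by line from a file instead of the sample list.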