Tags: amazon-web-services, kubernetes, amazon-cloudwatch, amazon-eks, aws-cloudwatch-log-insights

Container Insights on Amazon EKS AccessDeniedException


I'm trying to set up Container Insights on my EKS cluster, but I'm running into an issue when deploying. According to my logs, I'm getting the following:

[error] [output:cloudwatch_logs:cloudwatch_logs.2] CreateLogGroup API responded with error='AccessDeniedException'
[error] [output:cloudwatch_logs:cloudwatch_logs.2] Failed to create log group 

The strange part is that the pod seems to be assuming the role attached to my EC2 worker nodes rather than the role for the service account I created. I create the service account with the following command, and I can see the resulting role in AWS:

eksctl create iamserviceaccount --region ${env:AWS_DEFAULT_REGION} --name cloudwatch-agent --namespace amazon-cloudwatch --cluster ${env:CLUSTER_NAME} --attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy --override-existing-serviceaccounts --approve

Despite the service account being created successfully, I continue to get the AccessDeniedException.
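
A quick way to confirm which identity each pod runs under is to list the service account per pod and then read the IAM role annotation on that account. The sketch below assumes a POSIX shell with kubectl already pointed at the cluster. The two function names are made up for illustration; eks.amazonaws.com/role-arn is the annotation that IAM roles for service accounts rely on.

```shell
# Print each pod in a namespace together with the service account it runs as.
# $1 = namespace
pod_service_accounts() {
  kubectl get pods -n "$1" \
    -o custom-columns='POD:.metadata.name,SERVICE_ACCOUNT:.spec.serviceAccountName'
}

# Print the IAM role ARN annotated on a service account (empty if none is set).
# $1 = namespace, $2 = service account name
service_account_role() {
  kubectl get serviceaccount "$2" -n "$1" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}
```

For example, run `pod_service_accounts amazon-cloudwatch` and then `service_account_role amazon-cloudwatch cloudwatch-agent`. If the pod writing the error runs under a service account with no role annotation, its AWS SDK typically falls back to the worker node's instance role, which would match the behavior described here.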

One thing I found is that logging works fine when I manually attach the CloudWatchAgentServerPolicy to my worker node role. However, that is not the implementation I want. I would rather have an automated way of adding the service account without touching the worker nodes directly, if possible. The steps I followed can be found at the bottom of this documentation.

Thanks so much!


Solution

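  • For anyone running into this issue: the quickstart YAML defines its own fluent-bit service account, and Fluent Bit runs under that account rather than cloudwatch-agent. Because that account has no IAM role attached, the pods fall back to the worker node role. The fluent-bit service account must be removed from that file and created manually instead. I created it using the following command: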

    eksctl create iamserviceaccount --region ${env:AWS_DEFAULT_REGION} --name fluent-bit --namespace amazon-cloudwatch --cluster ${env:CLUSTER_NAME} --attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy --override-existing-serviceaccounts --approve
    

    After running this command and removing the fluent-bit service account from the YAML, delete and reapply all the resources in your amazon-cloudwatch namespace, and it should work.
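
Removing the service account from the quickstart YAML can be done by hand, but it can also be scripted. The following is a minimal sketch, assuming a POSIX shell with awk, a manifest whose documents are separated by `---` lines, and a service account declared with a top-level `kind: ServiceAccount` and a two-space-indented `name: fluent-bit`. The function name and file names are hypothetical.

```shell
# Drop the fluent-bit ServiceAccount document from a multi-document manifest.
# Reads YAML on stdin and writes the filtered manifest to stdout.
strip_fluent_bit_sa() {
  awk '
    function emit_doc() {
      # Print the buffered document unless it is the fluent-bit ServiceAccount.
      if (!(is_sa && is_fb)) printf "%s", doc
      doc = ""; is_sa = 0; is_fb = 0
    }
    # A separator line ends the previous document and starts a new one.
    /^---[ \t]*$/ { emit_doc() }
    # Buffer every line (including the separator) in the current document.
    { doc = doc $0 "\n" }
    # Only a top-level kind counts; indented "kind:" entries are ignored.
    /^kind:[ \t]*ServiceAccount[ \t]*$/ { is_sa = 1 }
    /^  name:[ \t]*fluent-bit[ \t]*$/ { is_fb = 1 }
    END { emit_doc() }
  '
}
```

Usage would look like `strip_fluent_bit_sa < quickstart.yaml > quickstart-no-sa.yaml`, followed by applying the filtered file. Other resources named fluent-bit, such as the DaemonSet, are kept because only a document that is both a ServiceAccount and named fluent-bit is dropped.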