amazon-web-services elasticsearch amazon-cognito amazon-elasticsearch amazon-opensearch

AWS Opensearch (Elasticsearch 7.10) - Refuses to assume Cognito group role


So, I have two Elasticsearch 7.10 clusters.

I have a Cognito user pool with an admin group. This admin group has an IAM role attached to it; call it the AdminRole. Its precedence is 1.

Now, I have configured both of the aforementioned Elasticsearch clusters to utilize Cognito authentication. They both use the same user pool, and the same identity pool.

That being said, when I log into the older cluster, click in the top right on my icon and then "view roles and identities" I see arn:aws:iam::{myaccountnumber}:role/cognito-AdminRole.

However, whenever I try the same in the new cluster, I see arn:aws:iam::{myaccountnumber}:role/cognito-auth-role. Why? Why is it picking up the cognito auth role instead of the role ascribed to the group?

I am logging in both times with the same account from the same Cognito user pool, and that account is in the Cognito admin group.

In both clusters, I have no backend roles referencing that auth role. If I add the cognito-auth-role as a master user (via its ARN, of course), then I can log in fine on the newer cluster (the one that sets my backend role to cognito-auth-role). When I remove cognito-auth-role as a backend role from the all_access and security_manager roles, I can no longer log in to that cluster and get the fabled "missing role" error.

In both cases, the cognito admin group ARN stays as a backend role for all_access and security_manager.

In other words: how do I force the cluster to assume arn:aws:iam::{myaccountnumber}:role/cognito-AdminRole for me instead of arn:aws:iam::{myaccountnumber}:role/cognito-auth-role? It's clearly possible, since the group's role is automatically assumed when I log in to the old cluster.


Solution

  • I genuinely struggled with this for the last 40 hours at work, while doing other things.

    About 35 minutes after posting to Stack Overflow, I found the answer.

    The old cluster had an authentication role selection rule in the identity pool: "Choose role from token". I set up the same rule for the new cluster, with role resolution set to "Deny" as it was for the old cluster, and now I get the same behaviour from both clusters.
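
For anyone who prefers to script this rather than click through the console, the same setting can probably be applied with boto3. This is only a rough sketch, not what I actually ran: `build_token_role_mapping` and `apply_role_mapping` are hypothetical helper names, and the identity pool ID and provider key are placeholders. The provider key is assumed to follow the `cognito-idp.<region>.amazonaws.com/<user_pool_id>:<app_client_id>` form, using the app client that belongs to the new cluster.

```python
from typing import Any, Dict


def build_token_role_mapping(provider_key: str) -> Dict[str, Dict[str, Any]]:
    """Build a RoleMappings entry equivalent to "Choose role from token"
    with role resolution set to "Deny" (hypothetical helper)."""
    return {
        provider_key: {
            "Type": "Token",                    # take the role from the token's claims
            "AmbiguousRoleResolution": "Deny",  # no fallback to the authenticated role
        }
    }


def apply_role_mapping(identity_pool_id: str, provider_key: str) -> None:
    """Merge the mapping into the identity pool's existing role mappings."""
    import boto3  # imported here so the builder above works without boto3 installed

    client = boto3.client("cognito-identity")
    current = client.get_identity_pool_roles(IdentityPoolId=identity_pool_id)

    # set_identity_pool_roles replaces the whole mapping, so start from the
    # existing entries to avoid dropping the old cluster's rule.
    mappings = dict(current.get("RoleMappings", {}))
    mappings.update(build_token_role_mapping(provider_key))

    client.set_identity_pool_roles(
        IdentityPoolId=identity_pool_id,
        Roles=current["Roles"],  # keep the default authenticated/unauthenticated roles
        RoleMappings=mappings,
    )


if __name__ == "__main__":
    # Placeholder key; substitute the provider entry for the new cluster.
    print(build_token_role_mapping(
        "cognito-idp.us-east-1.amazonaws.com/us-east-1_EXAMPLE:exampleclientid"
    ))
```

Running `get_identity_pool_roles` first and comparing the entries for the two clusters should also show whether the new cluster's provider is missing the token rule, which is what the symptoms in the question suggest.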