I'm writing a Flask API in PyCharm. When I run my code locally, boto3 requests to get secrets from Secrets Manager take less than a second. However, when I run the same code on an EC2 instance, each request takes about 3 minutes (I tried both a t2.micro and an m5.large).
At first I thought it could be a Python issue, so I ran the same request on my EC2 instances through the AWS CLI:
aws secretsmanager get-secret-value --secret-id secretname
It still took about 3 minutes. Why does this happen? Shouldn't this, in theory, be faster on an EC2 instance than on my local machine?
EDIT: This only happens when the EC2 instance is in a VPC other than the default VPC.
After fighting with this same issue on our local machines for almost two months, we finally had some forward progress today.
It turns out the problem is related to IPv6.
If you're using IPv6, the Secrets Manager domain will resolve to an IPv6 address. For some reason the CLI is unable to make a secure connection over IPv6. After that attempt times out, the CLI falls back to IPv4 and then succeeds.
To check whether you're resolving to an IPv6 address, just ping secretsmanager.us-east-1.amazonaws.com. Don't worry about the ping response; you only want to see the IP address the domain resolves to.
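If ping isn't available, or you'd rather check from Python, here is a minimal sketch of the same test using only the standard library. The helper name resolve_addresses is made up for this example, and port 443 is assumed because the endpoint is served over HTTPS.

```python
import socket


def resolve_addresses(host, port=443):
    """Return the addresses `host` resolves to, grouped by IP version.

    `resolve_addresses` is a hypothetical helper for illustration only.
    """
    results = {"IPv4": [], "IPv6": []}
    # getaddrinfo returns one tuple per (family, address) candidate;
    # restricting to TCP avoids duplicate entries for other protocols.
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, proto=socket.IPPROTO_TCP
    ):
        if family == socket.AF_INET6:
            label = "IPv6"
        elif family == socket.AF_INET:
            label = "IPv4"
        else:
            continue  # ignore any other address family
        address = sockaddr[0]
        if address not in results[label]:
            results[label].append(address)
    return results
```

Calling resolve_addresses("secretsmanager.us-east-1.amazonaws.com") on the affected machine and finding entries under "IPv6" would suggest you're in the situation described above.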
To fix this problem, you now have 3 options:
--cli-connect-timeout 1
Ultimately, option 1 is the real solution, but since it is so broad, the others might be easier.
Hopefully this helps someone else maintain a bit of sanity when they hit this.