I am trying to add an Amazon RDS (Postgres 15) connection in AWS Glue:
My JDBC URL is
jdbc:postgresql://xxx.xxx.us-west-2.rds.amazonaws.com:5432/my_db
which I can successfully connect in pgAdmin 4.
However, when I test this connection by
I got "FAILED".
The IAM role I am using even has AmazonRDSFullAccess:
When I click "View log" button, the error log is not really helpful. It seems AWS has issue parsing the errors:
ERROR StatusLogger Unrecognized format specifier [d]
ERROR StatusLogger Unrecognized conversion specifier [thread] starting at position 25 in conversion pattern.
ERROR StatusLogger Unrecognized format specifier [thread]
And here is my Security Group config. Basically both inbound and outbound rules allow all traffic:
Security Group - Inbound rules:
Security Group - Outbound rules:
I am wondering what else could cause this connection issue? Thanks!
After more research, I found the issue is because AWS Glue is using an old JDBC driver. So I have to upload the latest driver manually to use.
Download latest JDBC driver at https://jdbc.postgresql.org/download/
As of today, I downloaded latest JDBC driver 42.6.0 Java 8 version which filename is postgresql-42.6.0.jar
.
Then I uploaded at s3://my-bucket/aws-glue/postgresql-jdbc-driver/postgresql-42.6.0.jar
Update your JDBC connection in AWS Glue -> Connectors with
org.postgresql.Driver
s3://my-bucket/aws-glue/postgresql-jdbc-driver/postgresql-42.6.0.jar
In order to use my own JDBC driver, I need add the following permissions to the IAM role:
CreateJob
,
DeleteJob
, GetJob
, GetJobRun
, StartJobRun
.iam:PassRole
.s3:DeleteObjects
,
s3:GetObject
, s3:ListBucket
, s3:PutObject
.The default AWSGlueServiceRole
policy has CreateJob
, DeleteJob
, GetJob
, GetJobRun
, StartJobRun
. So I just need provide the rest of access to a new policy AWSGlueServiceRole-MyPolicy
I created:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:ListBucket",
"s3:DeleteObject"
],
"Resource": [
"arn:aws:s3:::my-bucket/aws-glue/postgresql-jdbc-driver/postgresql-42.6.0.jar",
"arn:aws:s3:::my-bucket"
]
},
{
"Effect": "Allow",
"Action": [
"iam:PassRole"
],
"Resource": [
"arn:aws:iam::xxxxxxxxxxxx:role/service-role/AWSGlueServiceRole-xxx"
]
}
]
}
Now the IAM role have
AWSGlueServiceRole
policyAWSGlueServiceRole-MyPolicy
(Note you do NOT need default AmazonRDSFullAccess
policy)
Note the "Test Connection" will still fail with error log
Caused by: com.amazonaws.services.glue.exceptions.InvalidInputException: Testing connections with custom drivers is not currently supported. Caused by: com.amazonaws.services.glue.exceptions.InvalidInputException: Testing connections with custom drivers is not currently supported.
However, the crawler should succeed!
Based on AWS Glue doc, Glue 4.0 is using PostgreSQL JDBC driver 42.3.6 which is quite new. However, I am not sure why the connection still failed.
Note my Amazon RDS is using Postgres 15 this case, and my show password_encryption;
returns scram-sha-256
. I do not need set password_encryption = 'md5';
like some users did.