Search code examples
amazon-web-servicesamazon-rdsaws-glue

Failed to connect Amazon RDS in AWS Glue Data Catalog's connection


I am trying to add an Amazon RDS (Postgres 15) connection in AWS Glue:

enter image description here

My JDBC URL is

jdbc:postgresql://xxx.xxx.us-west-2.rds.amazonaws.com:5432/my_db

which I can successfully connect in pgAdmin 4.

However, when I test this connection by

enter image description here

enter image description here

I got "FAILED".

enter image description here

The IAM role I am using even has AmazonRDSFullAccess:

enter image description here

When I click "View log" button, the error log is not really helpful. It seems AWS has issue parsing the errors:

ERROR StatusLogger Unrecognized format specifier [d]
ERROR StatusLogger Unrecognized conversion specifier [thread] starting at position 25 in conversion pattern.
ERROR StatusLogger Unrecognized format specifier [thread]

enter image description here

And here is my Security Group config. Basically both inbound and outbound rules allow all traffic:

Security Group - Inbound rules:

enter image description here

Security Group - Outbound rules:

enter image description here

I am wondering what else could cause this connection issue? Thanks!


Solution

  • After more research, I found the issue is because AWS Glue is using an old JDBC driver. So I have to upload the latest driver manually to use.

    Step 1. Download latest JDBC driver and upload to S3

    Download latest JDBC driver at https://jdbc.postgresql.org/download/ As of today, I downloaded latest JDBC driver 42.6.0 Java 8 version which filename is postgresql-42.6.0.jar.

    Then I uploaded at s3://my-bucket/aws-glue/postgresql-jdbc-driver/postgresql-42.6.0.jar

    Step 2. Update the JDBC connection

    Update your JDBC connection in AWS Glue -> Connectors with

    • JDBC Driver Class name: org.postgresql.Driver
    • JDBC Driver S3 Path: s3://my-bucket/aws-glue/postgresql-jdbc-driver/postgresql-42.6.0.jar

    enter image description here

    Step 3. Add new access for the AWS Glue IAM role that is being used

    In order to use my own JDBC driver, I need add the following permissions to the IAM role:

    • Grant permissions for the following job actions: CreateJob, DeleteJob, GetJob, GetJobRun, StartJobRun.
    • Grant permissions for IAM actions: iam:PassRole.
    • Grant permissions for Amazon S3 actions: s3:DeleteObjects, s3:GetObject, s3:ListBucket, s3:PutObject.
    • Grant service principal access to bucketin the IAM policy and your JDBC driver S3 location.

    The default AWSGlueServiceRole policy has CreateJob, DeleteJob, GetJob, GetJobRun, StartJobRun. So I just need provide the rest of access to a new policy AWSGlueServiceRole-MyPolicy I created:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:ListBucket",
                    "s3:DeleteObject"
                ],
                "Resource": [
                    "arn:aws:s3:::my-bucket/aws-glue/postgresql-jdbc-driver/postgresql-42.6.0.jar",
                    "arn:aws:s3:::my-bucket"
                ]
            },
            {
                "Effect": "Allow",
                "Action": [
                    "iam:PassRole"
                ],
                "Resource": [
                    "arn:aws:iam::xxxxxxxxxxxx:role/service-role/AWSGlueServiceRole-xxx"
                ]
            }
        ]
    }
    

    Now the IAM role have

    • The default AWSGlueServiceRole policy
    • My new policy AWSGlueServiceRole-MyPolicy

    (Note you do NOT need default AmazonRDSFullAccess policy)

    enter image description here

    Note the "Test Connection" will still fail with error log

    Caused by: com.amazonaws.services.glue.exceptions.InvalidInputException: Testing connections with custom drivers is not currently supported. Caused by: com.amazonaws.services.glue.exceptions.InvalidInputException: Testing connections with custom drivers is not currently supported.

    However, the crawler should succeed!

    enter image description here

    Extra Notes

    Based on AWS Glue doc, Glue 4.0 is using PostgreSQL JDBC driver 42.3.6 which is quite new. However, I am not sure why the connection still failed.

    Note my Amazon RDS is using Postgres 15 this case, and my show password_encryption; returns scram-sha-256. I do not need set password_encryption = 'md5'; like some users did.

    Reference