amazon-web-services amazon-rds aws-glue

How to connect a Glue job to RDS without the Glue Catalog


I've been trying to find a way to connect a Glue job to RDS PostgreSQL, but every solution I've found uses the Glue Catalog, which I don't want to use.

I only want to establish a connection and send some data from the Glue job (a Spark script) to my RDS database. I have already created the tables in the database; now I just want to send data to them. How should I approach this?

I also found some articles and videos that do this with JDBC, but none of them use a Glue job script. Please help me out.


Solution

  • If you don't want to use the Glue Catalog or the Glue connectors, you can use a Python library such as psycopg2, just as you would in any plain Python script. Make sure you specify psycopg2-binary in the job's --additional-python-modules parameter:

    import psycopg2
    
    # Connection details
    host = "your_rds_host"
    database = "your_database_name"
    user = "your_username"
    password = "your_password"
    port = "5432"  # Default PostgreSQL port
    
    # Query to execute
    query = """
    SELECT * FROM your_table LIMIT 10;
    """
    
    # Connect to the RDS instance
    with psycopg2.connect(
        host=host,
        database=database,
        user=user,
        password=password,
        port=port,
    ) as conn:
        # Create a cursor
        with conn.cursor() as cur:
            # Execute the query
            cur.execute(query)
    
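            # Fetch all rows returned by the query
            results = cur.fetchall()
    
    # psycopg2's "with conn" block only ends the transaction, so close explicitly
    conn.close()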
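
Since the question is about sending data rather than reading it, the same approach can be turned around for inserts. The following is a minimal sketch under stated assumptions, not a definitive implementation: build_insert_sql, insert_rows and load_to_rds are hypothetical helper names, your_table with columns id and name is a placeholder, and the connection values are the same placeholders used above. It also assumes the rows are already available on the driver as plain tuples (for example from df.collect() on a small DataFrame).

```python
def build_insert_sql(table, columns):
    """Build a parameterized INSERT using psycopg2-style %s placeholders.

    table and columns must be trusted constants, never user input,
    because they are interpolated into the SQL text directly.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"


def insert_rows(conn, table, columns, rows):
    """Insert rows (a list of tuples) through an open DB-API connection."""
    sql = build_insert_sql(table, columns)
    with conn.cursor() as cur:
        # Row values are passed as parameters, so the driver handles quoting
        cur.executemany(sql, rows)
    conn.commit()
    return len(rows)


def load_to_rds(rows):
    """Hypothetical entry point: connect with placeholder settings and insert."""
    import psycopg2  # needs psycopg2-binary in --additional-python-modules

    conn = psycopg2.connect(
        host="your_rds_host",
        database="your_database_name",
        user="your_username",
        password="your_password",
        port=5432,
    )
    try:
        return insert_rows(conn, "your_table", ["id", "name"], rows)
    finally:
        # Close even if the insert fails; uncommitted work is rolled back
        conn.close()
```

In the Glue script this would be called as load_to_rds([(1, "alice"), (2, "bob")]). For larger volumes, psycopg2.extras.execute_values or Spark's own JDBC writer is likely a better fit than row-by-row executemany.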