Tags: apache-spark, pyspark, amazon-rds, mysql-connector, spark-jdbc

Pyspark Dataframe to AWS MySql: requirement failed: The driver could not open a JDBC connection


I want to write a pyspark dataframe into a MySQL table in AWS RDS, but I keep getting the error

pyspark.sql.utils.IllegalArgumentException: requirement failed: The driver could not open a JDBC connection. Check the URL: jdbc:mysql:mtestdb.ch4i3d3jc0yc.eu-central-1.rds.amazonaws.com

My code looks like this:

import os
import sys

from pyspark.sql import SparkSession

spark = SparkSession.builder\
            .appName('test-app')\
            .config('spark.jars.packages', 'mysql:mysql-connector-java:8.0.28')\
            .getOrCreate()

properties = {'user':'admin', 'password':'password', 'driver':'com.mysql.cj.jdbc.Driver'}
resultDF.write.jdbc(url='jdbc:mysql:mtestdb.ch4i3d3jc0yc.eu-central-1.rds.amazonaws.com', table='mcm_objects', properties=properties)\
            .mode('append')\
            .save()

I also tried the URL 'jdbc:mysql://mtestdb.ch4i3d3jc0yc.eu-central-1.rds.amazonaws.com', but then I get the error:

java.sql.SQLException: No database selected

No idea what I am doing wrong. Any help would be greatly appreciated.


Solution

  • The connection error comes from the malformed URL: a MySQL JDBC URL needs // after the scheme, i.e. jdbc:mysql://<host>.

  • With that fixed, the "No database selected" error appears because the URL does not name a database, so table should be qualified as {dbName}.{dbtable}.

  • Note also that DataFrameWriter.jdbc() performs the write itself and returns None, so pass the save mode as the mode argument instead of chaining .mode(...).save():

    resultDF.write.jdbc(url='jdbc:mysql://mtestdb.ch4i3d3jc0yc.eu-central-1.rds.amazonaws.com',
                        table='{dbname}.mcm_objects',
                        mode='append',
                        properties=properties)
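
Alternatively, you can name the database in the JDBC URL path and keep the bare table name. Here is a minimal sketch of the same write using the generic DataFrameWriter options API; the database name mtestdb and port 3306 (MySQL's default) are assumptions, so substitute your actual schema:

    # Sketch: select the database in the URL path instead of qualifying the table.
    # 'mtestdb' is an assumed database name -- replace it with your actual schema.
    resultDF.write \
        .format('jdbc') \
        .option('url', 'jdbc:mysql://mtestdb.ch4i3d3jc0yc.eu-central-1.rds.amazonaws.com:3306/mtestdb') \
        .option('dbtable', 'mcm_objects') \
        .option('user', 'admin') \
        .option('password', 'password') \
        .option('driver', 'com.mysql.cj.jdbc.Driver') \
        .mode('append') \
        .save()

With this style the .mode('append').save() chaining works as expected, because format('jdbc') returns a DataFrameWriter that only executes the write on save().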