I have a pyspark program which connects to MySQL db successfully and reads a table. Now, I am trying to pass the database credentials from a properties file, instead of embedding them in the code, but not able to make it work.
from pyspark.sql import SparkSession
from pyspark.sql.types import *
#spark-submit –packages mysql:mysql-connector-java:8.0.13 workWithMySQL.py
spark = SparkSession.builder.appName(“MySQL connection”).getOrCreate()
#create spart context from spark session
sc = spark.sparkContext
#read from mysql
#configuration details
hostname=”localhost”
jdbcport=3306
dbname=”TEST”
username=”kanchan@localhost”
password=”password”
mysql_url = “jdbc:mysql://{0}:{1}/{2}?user={3}&password={4}”.format(hostname,jdbcport,dbname,username,password)
mysql_driver = “com.mysql.jdbc.Driver”
query = “(select * from cats) t1_alias”
df4 = spark.read.format(“jdbc”).options(driver=mysql_driver,url=mysql_url, dbtable=query).load()
df4.show()
Now, I have created a properties file jdbc.properties at $SPARK_HOME/conf
spark.mysql.user kanchan@localhost
spark.mysql.password password
And add it in spark-submit call
spark-submit –packages mysql:mysql-connector-java:8.0.13 --files $SPARK_HOME/conf/jdbc.properties workWithMySQL.py
replaced the assignments:
username=sc.getConf.getOption("spark.mysql.user")
password=sc.getConf.getOption("spark.mysql.user")
when run. it throws an error saying the function has no attribute as get option. I could not locate the appropriate documentation for it. Can anyone help?
Further, is it possible to encrypt the credentials or ensure data security by any other means?
The method getOption
should be replace with the method get
.
username=sc.getConf().get("spark.mysql.user")