I launched a fresh AWS EMR Spark cluster with Zeppelin to query a MySQL database. When I tried to add a MySQL interpreter in Zeppelin, the option did not exist. I searched for a way to make the interpreter appear but didn't find a solution. How can I get a MySQL interpreter in Zeppelin so I can query the MySQL database?
Spark SQL supports many features of SQL:2003 and SQL:2011 [1][2], so you may consider querying MySQL through Spark on Zeppelin by adding the MySQL JDBC driver as a dependency.
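Adding the dependency usually means making the MySQL Connector/J jar available to Zeppelin's Spark interpreter, for example through the interpreter's dependency settings. As a minimal sketch, assuming the Connector/J 8.x driver class name `com.mysql.cj.jdbc.Driver`, the following Scala paragraph checks whether that class can be loaded. `classAvailable` is a hypothetical helper name introduced here for illustration; it is not part of Spark or Zeppelin.

```scala
import scala.util.Try

// Hypothetical helper: true if the named class can be loaded from the
// classpath of the current interpreter session.
def classAvailable(className: String): Boolean =
  Try(Class.forName(className)).isSuccess

// Assumption: Connector/J 8.x, whose driver class is com.mysql.cj.jdbc.Driver.
val mysqlDriverFound = classAvailable("com.mysql.cj.jdbc.Driver")
println(s"MySQL JDBC driver on classpath: $mysqlDriverFound")
```

If this prints `false`, the jar has probably not reached the interpreter yet, and restarting the Spark interpreter after adding the dependency may be needed.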
Once the dependency is in place, you should be able to access a MySQL table. The following is an example using the Scala API:
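/* Database configuration. HOST, DATABASE, USERNAME and PASSWORD are placeholders:
   define them as Strings beforehand or substitute your own values. */
val jdbcURL = s"jdbc:mysql://${HOST}/${DATABASE}"
val jdbcUsername = s"${USERNAME}"
val jdbcPassword = s"${PASSWORD}"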
import java.util.Properties
val connectionProperties = new Properties()
connectionProperties.put("user", jdbcUsername)
connectionProperties.put("password", jdbcPassword)
connectionProperties.put("driver", "com.mysql.cj.jdbc.Driver")
/* Read data from MySQL; replace "${TABLE NAME}" with the actual table name */
val desiredData = spark.read.jdbc(jdbcURL, "${TABLE NAME}", connectionProperties)
desiredData.printSchema
/* Data Manipulation */
desiredData.createOrReplaceTempView("desiredData")
val query = s"""
SELECT COUNT(*) AS `Record Number`
FROM desiredData
"""
spark.sql(query).show
val query2 = s"""
SELECT ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column1, column2) AS column3
FROM desiredData
"""
spark.sql(query2).show
.
.
.
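The same window query can also be written with the DataFrame API instead of a SQL string. The sketch below is one possible equivalent, not the only way to do it. It uses a small in-memory DataFrame named `sample` as a hypothetical stand-in for the MySQL-backed `desiredData`, and it keeps the original columns alongside `column3` rather than selecting `column3` alone.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.row_number

// In Zeppelin a session already exists and getOrCreate() returns it;
// master("local[*]") is an assumption that only matters when run standalone.
val session = SparkSession.builder()
  .appName("row-number-sketch")
  .master("local[*]")
  .getOrCreate()
import session.implicits._

// Hypothetical sample data standing in for the table read over JDBC.
val sample = Seq(("a", 2), ("a", 1), ("b", 5)).toDF("column1", "column2")

// Same window as: PARTITION BY column1 ORDER BY column1, column2
val byColumn1 = Window.partitionBy("column1").orderBy("column1", "column2")

// Number the rows within each column1 group.
val numbered = sample.withColumn("column3", row_number().over(byColumn1))
numbered.show()
```

To apply it to the real table, replace `sample` with `desiredData`.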