
How to write to JDBC source with SparkR 1.6.0?


With SparkR 1.6.0 I can read from a JDBC source with the following code:

jdbc_url <- "jdbc:mysql://localhost:3306/dashboard?user=<username>&password=<password>"

df <- sqlContext %>%
  loadDF(source     = "jdbc", 
         url        = jdbc_url, 
         driver     = "com.mysql.jdbc.Driver",
         dbtable    = "db.table_name")

After performing a calculation, though, I hit a roadblock when I try to write the data back to the database. Attempting...

write.df(df      = df,
         path    = "NULL",
         source  = "jdbc",
         url     = jdbc_url, 
         driver  = "com.mysql.jdbc.Driver",
         dbtable = "db.table_name",
         mode    = "append")

...returns...

ERROR RBackendHandler: save on 55 failed
Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) : 
  java.lang.RuntimeException: org.apache.spark.sql.execution.datasources.jdbc.DefaultSource does not allow create table as select.
    at scala.sys.package$.error(package.scala:27)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:259)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
    at org.apache.spark.sql.DataFrame.save(DataFrame.scala:2066)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
    at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:86)
    at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:38)
    at io.netty.channel.SimpleChannelIn

Looking around the web, I found a reference saying that a patch for this error was included as of version 2.0.0, which also adds the functions read.jdbc and write.jdbc.
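For comparison, here is a minimal sketch of what the write might look like on SparkR 2.0.0 or later, using write.jdbc. It assumes an active Spark session with the MySQL driver on the classpath. The wrapper name append_to_table is hypothetical and is only there for illustration.

```r
# Sketch only: requires SparkR >= 2.0.0; this does not work on 1.6.0.
# append_to_table is a hypothetical helper name, not part of SparkR.
append_to_table <- function(df, url, table) {
  # write.jdbc(x, url, tableName, mode, ...) saves a SparkDataFrame over JDBC;
  # mode = "append" adds rows to an existing table instead of raising an error.
  SparkR::write.jdbc(df, url, table, mode = "append")
}

# Usage, reusing df and jdbc_url from the read example above:
# append_to_table(df, jdbc_url, "db.table_name")
```

Because the call goes through the dedicated JDBC writer rather than the generic save path, it should avoid the "does not allow create table as select" error shown above.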

For this question, though, assume I'm stuck with SparkR v1.6.0. Is there a way to write to JDBC sources (i.e. is there a workaround that would allow me to use DataFrameWriter.jdbc() from SparkR)?


Solution

  • The short answer is no: SparkR did not support the JDBC write method until version 2.0.0.