I'm trying to process a considerable number of records using Cloud Dataflow. My source is Google Cloud Storage and my sink is Cloud SQL (MySQL). I have the following code to write to the sink:
p.apply()
....
    .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
        "com.mysql.cj.jdbc.Driver",
        "jdbc:mysql://google/<DBNAME>?cloudSqlInstance=<INSTANCE_NAME>&socketFactory=com.google.cloud.sql.mysql.SocketFactory&user=<USERNAME>&password=<PASSWORD>&useSSL=false"
    )
)
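Since the whole connection is described by that one URL string, it can help to assemble it in a single place. The following is only a minimal sketch using the JDK alone: the class `CloudSqlJdbcUrl` and its `build` method are hypothetical names, not part of Beam or the Cloud SQL socket factory. It percent-encodes the user name and password, on the assumption that the driver decodes those values, so that characters such as `&` or `@` in a password do not break the query string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Beam or the Cloud SQL connector):
// builds the same URL shape as the snippet above from its four placeholders.
final class CloudSqlJdbcUrl {

    private CloudSqlJdbcUrl() {}

    static String build(String dbName, String instanceName, String user, String password) {
        return "jdbc:mysql://google/" + dbName
            + "?cloudSqlInstance=" + instanceName
            + "&socketFactory=com.google.cloud.sql.mysql.SocketFactory"
            // Assumption: the driver percent-decodes credential values.
            + "&user=" + encode(user)
            + "&password=" + encode(password)
            + "&useSSL=false";
    }

    // Percent-encodes a value so reserved characters cannot split the query string.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }
}
```

The resulting string would then be passed as the second argument of `JdbcIO.DataSourceConfiguration.create(...)`, in place of the literal URL.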
The above works fine when I run the pipeline using DirectRunner, but it throws a NullPointerException when run on DataflowRunner. The exception is as follows:
java.lang.NullPointerException
at org.apache.beam.sdk.io.jdbc.JdbcIO$Write$WriteFn.executeBatch(JdbcIO.java:775)
at org.apache.beam.sdk.io.jdbc.JdbcIO$Write$WriteFn.finishBundle(JdbcIO.java:755)
I tried Beam versions 2.15.0 and 2.16.0, and both fail. Why does this happen, and how can I make it work with DataflowRunner?
I have resolved this. The NullPointerException has a different cause, located in finishBundle