I'm using https://github.com/springml/spark-salesforce to query the Salesforce API. Standard queries work fine, but when I add the bulk options listed in the project's documentation, I get the error shown below. Based on that documentation I believe this is the correct approach, so please let me know if I'm making a basic mistake.
I'm trying to run a bulk query against our API using the following SOQL statement:
val account_soql = "select industry from account"
I get the following error when the bulk option is enabled and sfObject is set to account:
Exception in User Class: java.lang.UnsupportedOperationException : Cannot create XMLStreamReader or XMLEventReader from a org.codehaus.stax2.io.Stax2ByteArraySource
I've tried both of the following reads and see the same issue with each:
val account_data = sparkSession.read
  .format("com.springml.spark.salesforce")
  .option("soql", account_soql)
  .option("username", "username")
  .option("password", "password")
  .option("sfObject", "account")
  .option("bulk", "true")
  .load()

val account_data = sparkSession.read
  .format("com.springml.spark.salesforce")
  .option("soql", account_soql)
  .option("username", "username")
  .option("password", "password")
  .option("multiLine", "true")
  .option("sfObject", "account")
  .option("inferSchema", "true")
  .option("bulk", "true")
  .option("version", "latest-version")
  .load()
I am using the following jar versions:
force-partner-api-40.0.0.jar
force-wsc-40.0.0.jar
salesforce-wave-api-1.0.9.jar
spark-salesforce_2.11-1.1.1.jar
These are sourced from this article
I also tried updating to the latest version of spark-salesforce (February 2021) and got the following error:
Command failed with exit code 1 - INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V)
Let me know if I can provide any other detail to assist
This is a problem with the stax2 library. Add the woodstox-core-asl-4.4.1.jar file to the dependent jars in the Glue job configuration and it will resolve this error.
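To check whether the added jar is actually being picked up, a small diagnostic like the one below may help. It is only a sketch: StaxCheck and locationOf are hypothetical names chosen for illustration, not part of spark-salesforce or Glue. It prints which StAX factory implementation the JVM resolves, where that class was loaded from, and the Scala version the job is running on.

```scala
import javax.xml.stream.XMLInputFactory

// Hypothetical helper for illustration; not part of spark-salesforce or Glue.
object StaxCheck {
  // Report the jar or directory a class was loaded from.
  // Classes built into the JDK have no code source, so fall back to a placeholder.
  def locationOf(cls: Class[_]): String =
    Option(cls.getProtectionDomain)
      .flatMap(pd => Option(pd.getCodeSource))
      .flatMap(cs => Option(cs.getLocation))
      .map(_.toString)
      .getOrElse("(JDK built-in or unknown location)")

  def main(args: Array[String]): Unit = {
    // The standard StAX lookup: returns whichever implementation the classpath provides.
    val factory = XMLInputFactory.newInstance()
    println(s"StAX factory class: ${factory.getClass.getName}")
    println(s"Loaded from: ${locationOf(factory.getClass)}")
    // Scala version of the running job, useful when comparing against the _2.11 jar suffix.
    println(s"Scala version: ${scala.util.Properties.versionNumberString}")
  }
}
```

If the Woodstox jar is on the classpath, the factory class would be expected to come from a com.ctc.wstx package rather than the JDK default. The Scala version line is there because a NoSuchMethodError on scala.Product.$init$ usually points to a library built for a different Scala binary version than the one the job runs on, so it may be worth comparing that output with the suffix of the spark-salesforce jar in use.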