I have a Spark UDF which I call from a SQL statement. This is how I have defined and registered the UDF:
UDF:
import org.apache.spark.sql.functions.udf

def udf_temp_emp(_type: String, _name: String): Int = {
  val statement = s"select key from employee where empName = '${_name}'"
  val key = spark.sql(statement).collect()(0).getInt(0)
  key
}

spark.udf.register("udf_temp_emp", udf_temp_emp(_, _))
And this is how I call it in my SQL command:
select
  udf_temp_emp(emp.type, emp.name),
  emp.id
from
  empMaster emp
When I run the above command, it throws the exception below:
SparkException: [FAILED_EXECUTE_UDF] Failed to execute user defined function ($read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$Lambda$13841/67716274: (string, string) => string). Caused by: NullPointerException:
Per werner's comment (which should be an answer, and the accepted one at that), this isn't possible: the SparkContext / SparkSession does not exist on the executor nodes where the UDF runs, so the spark reference inside the UDF is null at execution time, which is what produces the NullPointerException. The only time this "works" is if you use LocalRelations (such as a converted Seq), which can be evaluated on the driver.

Re-write your UDF as a join, assuming that is the actual logic.
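A minimal sketch of that join, assuming the UDF's only purpose is to look up employee.key by name (the _type argument is unused in the posted code) and that spark is the same SparkSession the question uses:

// Replaces the per-row UDF lookup with a single join.
// Column and table names mirror those in the question.
val result = spark.sql("""
  select e.key, m.id
  from empMaster m
  left join employee e
    on e.empName = m.name
""")
result.show()

The left join keeps every empMaster row and yields null for key where no employee matches, whereas the original UDF would have thrown on collect()(0) for a missing name.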