scala apache-spark notserializableexception

Problem creating an auxiliary tail-recursive function in a Spark Scala UDF


I am working on a small project where I read data from Kafka and pass each record to a UDF. The UDF contains a while loop that I need to replace with tail recursion.

while (condition) {
    fields
    body
}

to

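import scala.annotation.tailrec

// `condition` and `body` are placeholders for the original loop's parts
@tailrec
def whileReplacement(dummy: Int): Int = {
    if (!condition) 1            // loop finished
    else {
        body
        whileReplacement(dummy)  // call itself in tail position (the original called parseExtTag here)
    }
}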

But I am getting java.io.NotSerializableException. I do not understand what is causing the error or how to solve it. If you have a better approach, please share it. Thank you.


Solution

  • Previously, I declared and called the recursive function inside the UDF function itself.

    The problem was solved by moving the recursive function outside of the UDF function. I think that making the function available to the Spark executors this way solves the serialization problem in this case. This is just my understanding; I am not sure what was really happening. If anyone knows, please explain.
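
    One possible explanation, offered as a sketch rather than a diagnosis of the original code: Spark ships the UDF closure to executors using Java serialization. If the closure calls a method that belongs to an enclosing, non-serializable instance, it captures that instance and serialization fails. A function that lives in a top-level object is not captured at all. The example below reproduces this without Spark, using plain Java serialization and assuming Scala 2.12 or later. The names LoopHelpers, Job, Closures, SerializationCheck, and countDown are hypothetical and do not come from the question.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}
import scala.annotation.tailrec

// Hypothetical stand-in for the while loop: one recursive call per iteration.
object LoopHelpers {
  @tailrec
  def countDown(remaining: Int, steps: Int = 0): Int =
    if (remaining <= 0) steps                 // loop condition is false: stop
    else countDown(remaining - 1, steps + 1)  // loop body, then next iteration
}

// Hypothetical non-serializable class, standing in for whatever encloses the UDF.
class Job {
  def helper(n: Int): Int = LoopHelpers.countDown(n)

  // Calls an instance method, so the closure captures `this` (a Job).
  val capturesThis: Int => Int = n => helper(n)
}

object Closures {
  // Calls only the top-level object, so the closure captures nothing.
  val safe: Int => Int = n => LoopHelpers.countDown(n)
}

object SerializationCheck {
  // Attempts Java serialization of a closure and reports whether it succeeded.
  def serializes(f: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(f)
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

    Under these assumptions, SerializationCheck.serializes(Closures.safe) succeeds, while SerializationCheck.serializes(new Job().capturesThis) fails because Job is not Serializable. Whether the original UDF failed for exactly this reason depends on code that is not shown in the question.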