Tags: apache-spark, pyspark, udf

UDF reload in PySpark


I am using PySpark (inside a Jupyter notebook that connects to a Spark cluster) together with some UDFs. One UDF takes a list as an additional parameter, and I construct it like this:

my_udf = F.udf(partial(my_normal_fn, list_param=list), StringType())
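
For reference, the relevant imports look roughly like this (the rest of the notebook setup is omitted):

from functools import partial

from pyspark.sql import functions as F
from pyspark.sql.types import StringType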

Everything works fine as far as executing the function goes. But I noticed that the UDF is never updated: when I change the list, for example by altering an element in it, the UDF keeps using the old version of the list. This happens even if I re-execute the whole notebook. I have to restart the Jupyter kernel to pick up the new version of the list, which is really annoying.

Any thoughts?


Solution

  • I found the solution.

My my_normal_fn had the following signature:

    def my_normal_fn(x, list_param=[]):
        # do some stuff with x and list_param
        ...
    

    Changing it to

    def my_normal_fn(x, list_param):
        # do some stuff with x and list_param
        ...
    

    did the trick. See here for more information. (A minimal end-to-end sketch of the fixed pattern follows below.)

    Thanks to user Drjones78 of the SparkML Slack channel.
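
For readers who want a runnable reference, here is a minimal sketch of the fixed pattern. The SparkSession, the sample data, the column name value, and the body of my_normal_fn are all made up for illustration; the point is that list_param has no default value and the UDF is re-created after the list changes.

from functools import partial

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("a",), ("b",)], ["value"])

def my_normal_fn(x, list_param):
    # Placeholder body: report whether x appears in the list.
    return "hit" if x in list_param else "miss"

list_param = ["a"]
my_udf = F.udf(partial(my_normal_fn, list_param=list_param), StringType())
df.withColumn("flag", my_udf("value")).show()

# After changing the list, rebuild the UDF so the new list is picked up.
list_param = ["b"]
my_udf = F.udf(partial(my_normal_fn, list_param=list_param), StringType())
df.withColumn("flag", my_udf("value")).show()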