I want to pass two arguments (let's say x and y) to a PySpark UDF.
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

# I want to pass x and y as arguments
@udf(returnType=StringType())
def my_udf(s, x, y):
    return some_result

# Now call the UDF on a PySpark DataFrame (df)
# I don't know how to pass the two arguments x and y here when calling the UDF
df = df.withColumn('new_col_name', my_udf(df.col, x, y))
To pass variables to a PySpark UDF, you can use the lit function from the pyspark.sql.functions module. It wraps a constant value in a Column, so the value can be passed as an argument to the UDF.
from pyspark.sql.functions import lit, udf
from pyspark.sql.types import StringType

@udf(returnType=StringType())
def my_udf(s, x, y):
    return some_result

# Now call the UDF on a PySpark DataFrame (df)
# Wrap x and y in lit() so they are passed as constant columns
df = df.withColumn('new_col_name', my_udf(df.col, lit(x), lit(y)))
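If it helps, here is a rough end-to-end sketch of the same idea. The function names (format_value, add_formatted_column), the column name "name", and the function body are all made up for illustration, since the question does not say what some_result should be. The logic is kept in a plain Python function so it can be checked without Spark, and is only wrapped as a UDF inside the helper.

```python
def format_value(value, x, y):
    """Plain Python logic; a hypothetical stand-in for some_result."""
    # Spark passes None for null cells, so handle that explicitly.
    if value is None:
        return None
    return f"{x}{value}{y}"


def add_formatted_column(df, x, y):
    """Add 'new_col_name' by applying format_value to the (assumed) 'name' column."""
    # Imported here so format_value can be tested without a Spark installation.
    from pyspark.sql.functions import col, lit, udf
    from pyspark.sql.types import StringType

    format_udf = udf(format_value, StringType())
    # lit() turns each Python constant into a Column, which is what a UDF call expects.
    return df.withColumn("new_col_name", format_udf(col("name"), lit(x), lit(y)))
```

You would then call it as add_formatted_column(df, "<", ">") on a DataFrame that has a string column called "name".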
Hope this helps.