apache-spark  pyspark  concatenation

Concatenate two columns of a Spark DataFrame with null values


I have two columns in my Spark DataFrame:

First_name  Last_name
Shiva       Kumar
Karthik     kumar
Shiva       Null
Null        Shiva

I need to add a new column to the DataFrame that concatenates the two columns above with a comma and also handles null values.

I have tried using concat and coalesce, but I can't get output in which the comma delimiter appears only when both columns have values.

Expected output

Full_name
Shiva,Kumar
Karthik,kumar
Shiva
Shiva

Solution

  • concat_ws concatenates the columns with a separator and skips null values for you, so the comma appears only when both columns are non-null.

    from pyspark.sql import functions as F
    df = df.withColumn('Full_name', F.concat_ws(',', F.col('First_name'), F.col('Last_name')))
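
To see why concat_ws fits this case while concat does not, here is a plain-Python model of the two null-handling rules. It is only a sketch of the semantics, not Spark's implementation: concat returns null if any input is null, whereas concat_ws drops null inputs and joins whatever remains with the separator. The helper names model_concat and model_concat_ws are hypothetical, and None stands in for a Spark null.

```python
from typing import Optional


def model_concat(*values: Optional[str]) -> Optional[str]:
    """Model of concat: a single null input makes the whole result null."""
    if any(v is None for v in values):
        return None
    return "".join(values)


def model_concat_ws(sep: str, *values: Optional[str]) -> str:
    """Model of concat_ws: null inputs are skipped, the rest are joined with sep."""
    return sep.join(v for v in values if v is not None)


# Sample rows from the question; None represents a null cell.
rows = [("Shiva", "Kumar"), ("Karthik", "kumar"), ("Shiva", None), (None, "Shiva")]

# Because nulls are dropped before joining, no stray comma is produced.
full_names = [model_concat_ws(",", first, last) for first, last in rows]
```

Only nulls are skipped. An empty string still counts as a value, so if missing names are stored as '' rather than null, the result would keep the separator (for example ',Shiva'), and those values would need converting to null first.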