apache-spark, apache-spark-sql

Unable to append "Quotes" in write for dataframe


I am trying to save a DataFrame as a .csv file in Spark. Every field must be enclosed in double quotes, but currently the fields in the output file are not quoted.

I am using Spark 2.1.0

Code :

DataOutputResult.write.format("com.databricks.spark.csv").
option("header", true).
option("inferSchema", false).
option("quoteMode", "ALL").
mode("overwrite").
save(Dataoutputfolder)

Output format (actual) :

Name, Id,Age,Gender

XXX,1,23,Male

Output format (Required) :

"Name","Id","Age","Gender"

"XXX","1","23","Male"

Options I tried so far :

I tried the quoteMode and quote options when writing the file, but with no success.


Solution

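  • Replace quoteMode with quote, i.e. option("quote", "all"). Note that in Spark 2.x's built-in CSV writer, quote sets the quote character itself; the documented flag for quoting every field is option("quoteAll", "true").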

    or use concat or concat_ws directly on the DataFrame columns and save without the quoteMode option

    import org.apache.spark.sql.functions.{concat, lit}
    
    // lit("\"") is a literal double quote; the $"col" syntax needs import spark.implicits._
    val newDF = df.select(concat($"Name", lit("\""), $"Age"))
    

    or write your own UDF to add the desired behaviour; more examples can be found in "Concatenate columns in apache spark dataframe"
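
For reference, here is a minimal sketch of the writer-option route. It assumes Spark 2.x's built-in CSV source and its quoteAll option, which the original answer does not mention. The object name, the sample row standing in for DataOutputResult, and the output path are all hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object QuoteAllExample {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only
    val spark = SparkSession.builder()
      .appName("quote-all-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for DataOutputResult from the question
    val df = Seq(("XXX", 1, 23, "Male")).toDF("Name", "Id", "Age", "Gender")

    df.write
      .format("csv")                // built-in CSV source in Spark 2.x
      .option("header", "true")
      .option("quoteAll", "true")   // quote every field, not only those that need it
      .mode("overwrite")
      .save("/tmp/quoted-output")   // hypothetical output folder

    spark.stop()
  }
}
```

If this option behaves as documented in your Spark version, the concat and UDF workarounds should not be needed; they remain useful when only some columns should be quoted.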