scala apache-spark

Text data source supports only a single column, and you have 8 columns


This is the error I got when I tried to save a DataFrame as text:

org.apache.spark.sql.AnalysisException: Text data source supports only a single column, and you have 8 columns

This is the code:

df.write.text("/tmp/wt")

What am I doing wrong?


Solution

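  • The text data source can only write a DataFrame that has a single string column, so a DataFrame with 8 columns has to be written in another format. In Spark 1.6, the easiest solution is to use Databricks' spark-csv library and write: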

    df.write.format("com.databricks.spark.csv").save("pathToFile.csv")
    

    If you do not want to use it, you can convert each row of your DataFrame into a CSV line yourself, like this:

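    df.rdd
      // Render each Row as one line: join its field values with ';'
      .map(_.toSeq.mkString(";"))
      // Despite the name, the path becomes a directory of part files
      .saveAsTextFile("pathToFile.csv")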
    

    Note that if your fields can contain the separator, quotes, or line breaks, you will have to enclose those fields in quotes and escape any existing quotes yourself; the library does this for you transparently.
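
The quoting mentioned in the note could be sketched roughly as follows. This is a minimal sketch, not what the spark-csv library does internally: `quoteField` and `toCsvLine` are hypothetical helper names, and it assumes RFC 4180 style quoting (embedded quotes are doubled) and that null values are written as empty fields.

```scala
// Hypothetical helper: quote a single field only when it needs it.
// Assumption: nulls become empty fields (the snippet above would write "null").
def quoteField(value: Any, sep: String = ";"): String = {
  val s = Option(value).map(_.toString).getOrElse("")
  val needsQuotes =
    s.contains(sep) || s.contains("\"") || s.contains("\n") || s.contains("\r")
  // Double any embedded quotes, then wrap the whole field in quotes
  if (needsQuotes) "\"" + s.replace("\"", "\"\"") + "\"" else s
}

// Hypothetical helper: build one CSV line from the values of a row.
def toCsvLine(fields: Seq[Any], sep: String = ";"): String =
  fields.map(quoteField(_, sep)).mkString(sep)

// Possible use with the RDD approach shown earlier (requires Spark):
//   df.rdd.map(row => toCsvLine(row.toSeq)).saveAsTextFile("pathToFile.csv")
```

With these helpers, a field such as `a;b` would be written as `"a;b"`, so a reader that understands quoted fields can still split the line correctly.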