I'm working on report generation with Spark, and I need to add a column with a constant value to a Dataset that is created with Dataset.select()
and then written to a CSV file:
private static void buildReport(FileSystem fileSystem, Dataset<Row> joinedDs, String reportName) throws IOException {
    Path report = new Path(reportName);
    joinedDs.filter(aFlter)
            .select(
                    joinedDs.col("AGREEMENT_ID"),
                    //... here I need to insert a column with constant value
                    joinedDs.col("ERROR_MESSAGE")
            )
            .write()
            .format("csv")
            .option("header", true)
            .option("sep", ",")
            .csv(reportName);
    fileSystem.copyToLocalFile(report, new Path(reportName + ".csv"));
}
I don't want to insert the column manually into the generated CSV file; I'd like the column to be there when the file is created.
You can add it with the lit function (a static method of org.apache.spark.sql.functions) inside the select:
.select(
        joinedDs.col("AGREEMENT_ID"),
        lit("YOUR_CONSTANT_VALUE").as("YOUR_COL_NAME"),
        joinedDs.col("ERROR_MESSAGE")
)
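For completeness, here is a minimal self-contained sketch of the same idea, including the static import that lit needs. It assumes a local SparkSession and a tiny in-memory dataset; the class name ConstantColumnExample, the helper withConstant, and the REPORT_TYPE / "DAILY" values are made up for illustration and are not part of the original code.

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.lit;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class ConstantColumnExample {

    // Hypothetical helper: selects the two report columns and places a
    // constant-valued column between them.
    static Dataset<Row> withConstant(Dataset<Row> ds, String name, Object value) {
        return ds.select(
                col("AGREEMENT_ID"),
                lit(value).as(name),
                col("ERROR_MESSAGE"));
    }

    public static void main(String[] args) {
        // Local session for demonstration only.
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("constant-column-example")
                .getOrCreate();

        // Small stand-in for the joined dataset from the question.
        StructType schema = new StructType()
                .add("AGREEMENT_ID", DataTypes.StringType)
                .add("ERROR_MESSAGE", DataTypes.StringType);
        List<Row> rows = Arrays.asList(
                RowFactory.create("A-1", "missing field"),
                RowFactory.create("A-2", "bad date"));
        Dataset<Row> ds = spark.createDataFrame(rows, schema);

        // Every row gets the same value in the REPORT_TYPE column.
        withConstant(ds, "REPORT_TYPE", "DAILY").show();

        spark.stop();
    }
}
```

If the position of the column does not matter, ds.withColumn("REPORT_TYPE", lit("DAILY")) should also work, but it appends the new column at the end. Using lit inside select keeps control over the column order in the CSV.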