I am saving my Spark dataset as a Parquet file on my local machine. Is there any way to encrypt the data with an encryption algorithm? The code I use to save the data as a Parquet file looks something like this:
dataset.write().mode("overwrite").parquet(parquetFile);
I saw a similar question, but mine is different because I am writing to my local disk.
I don't think you can do this with Spark directly; however, there are other projects you can put around Parquet, in particular Apache Arrow. I think this video explains how to do it:
https://databricks.com/session_na21/data-security-at-scale-through-spark-and-parquet-encryption
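UPDATE: since Spark 3.2.0 this is possible in Spark itself, which supports columnar encryption for Parquet files (Parquet modular encryption, which needs Apache Parquet 1.12 or later).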
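For what it's worth, here is a minimal sketch of the settings that Spark 3.2+ Parquet encryption is driven by, as far as I understand them. The property names follow the Spark and Parquet documentation. The class name ParquetEncryptionSettings, the helper methods, the key IDs keyA/keyB, the column name "square" and the demo keys are only illustrative assumptions. The sketch deliberately uses only the JDK so it compiles on its own, and the Spark calls are shown as comments. InMemoryKMS is a mock meant for local experiments (it may require the parquet-hadoop tests artifact on the classpath); a real setup would plug in its own KMS client class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the properties that drive Parquet encryption.
class ParquetEncryptionSettings {

    // Hadoop-level properties that switch Parquet encryption on.
    static Map<String, String> hadoopProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        // Crypto factory that reads the encryption setup from properties.
        props.put("parquet.crypto.factory.class",
                "org.apache.parquet.crypto.keytools.PropertiesDrivenCryptoFactory");
        // Mock KMS client: for local experiments only, not for production.
        props.put("parquet.encryption.kms.client.class",
                "org.apache.parquet.crypto.keytools.mocks.InMemoryKMS");
        // Base64-encoded 128-bit demo master keys; only the mock KMS reads this list.
        props.put("parquet.encryption.key.list",
                "keyA:AAECAwQFBgcICQoLDA0ODw==, keyB:AAECAAECAAECAAECAAECAA==");
        return props;
    }

    // Per-write options: which master key protects which columns and the footer.
    // columnKeys has the form "<keyId>:<col>,<col>;<keyId>:<col>".
    static Map<String, String> writeOptions(String columnKeys, String footerKey) {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("parquet.encryption.column.keys", columnKeys);
        opts.put("parquet.encryption.footer.key", footerKey);
        return opts;
    }

    public static void main(String[] args) {
        // With Spark 3.2+ on the classpath, the maps would be applied roughly like this:
        //   hadoopProperties().forEach(spark.sparkContext().hadoopConfiguration()::set);
        //   dataset.write().mode("overwrite")
        //          .options(writeOptions("keyA:square", "keyB"))
        //          .parquet(parquetFile);
        hadoopProperties().forEach((k, v) -> System.out.println(k + " = " + v));
        writeOptions("keyA:square", "keyB").forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Reading the file back should only need the same Hadoop properties (crypto factory, KMS client and access to the keys); the column and footer key options matter on the write side.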