python pandas dataframe pyspark databricks

Convert pyspark dataframe to json file


I have the dataframe below and want to write its contents to a .json file.

Sample input dataframe and expected output json

When creating the output file, I do not want the success, part, and log files that get written alongside it, so I tried to collect() the values from the dataframe and used json.dumps() to create the file. But I am losing the column names and formats compared with the expected format in the picture.
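
For context, a collected Row behaves like a tuple, so passing it straight to json.dumps() serializes only the values, which would explain the missing column names. A minimal sketch of one way around that, assuming the data fits in driver memory; write_rows_as_json and the output path are hypothetical names, not something from the original post:

```python
import json


def write_rows_as_json(rows, path):
    """Write collected rows to a single JSON file as a list of objects."""
    # Row.asDict(recursive=True) keeps the column names and converts
    # nested Row values to dicts as well.
    records = [row.asDict(recursive=True) for row in rows]
    with open(path, "w", encoding="utf-8") as fh:
        # default=str is a fallback for values json cannot encode
        # directly (dates, decimals); they are written as strings.
        json.dump(records, fh, indent=2, default=str)
    return records


# Hypothetical usage on the driver, where df is the Spark dataframe:
# write_rows_as_json(df.collect(), "/tmp/output.json")
```

Because the file is written with plain Python on the driver, no success or part files are produced, but this only suits dataframes small enough to collect.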

Please help!


Solution

  • I used json_normalize (from pandas) and that resolved the issue.
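
The answer gives no code, so the following is only a sketch of what a json_normalize approach might look like, assuming it refers to pandas.json_normalize. The records are made-up sample data standing in for the collected rows, not the asker's dataframe:

```python
import pandas as pd

# Hypothetical records, e.g. the kind of list that
# [r.asDict(recursive=True) for r in df.collect()] could produce.
records = [
    {"id": 1, "name": "a", "address": {"city": "Pune", "zip": "411001"}},
    {"id": 2, "name": "b", "address": {"city": "Delhi", "zip": "110001"}},
]

# json_normalize flattens nested dicts into dotted column names
# such as "address.city".
flat = pd.json_normalize(records)

# orient="records" emits a JSON array with one object per row,
# keeping the column names as keys.
json_text = flat.to_json(orient="records")

# To write a file instead, pass a path (hypothetical here):
# flat.to_json("/tmp/output.json", orient="records")
```

Note that json_normalize flattens nested fields rather than preserving them, so whether this matches the expected output depends on the shape shown in the picture.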