Tags: databricks, azure-databricks, data-profiling

Databricks: Export data profiling report


Databricks can generate a data profiling report from the output of display(dataframe_name).
I have created a data profiling report in Azure Databricks, but I do not know how to export it. How can I export or download this report to my local system?


Solution

    • There is no direct option to download the data profiling report from Azure Databricks to a local machine in tabular format.

    • Data profiling is itself a new feature, introduced to reduce the manual work needed to summarize the statistics of a dataframe.

    • As specified in the official Microsoft documentation, the data profile can only be added to a dashboard.

    • There are also no other APIs that can be used to download this data in tabular format.

    • As a possible workaround, the required attributes can be calculated manually using pandas or the pandas API on Spark.

    • In general, some of these statistics can be obtained directly with df.describe(), as shown below. Here, df is a PySpark DataFrame:

    [Image: output of df.describe()]
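
The manual pandas workaround described above could look roughly like the sketch below. It is not the built-in Databricks data profile, only a hand-rolled approximation: `build_profile`, the `sample` data, and the `data_profile.csv` file name are hypothetical names chosen for illustration, and the set of statistics is an assumption about what the profile should contain.

```python
import pandas as pd


def build_profile(pdf: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with basic profiling statistics."""
    rows = []
    for name in pdf.columns:
        col = pdf[name]
        row = {
            "column": name,
            "dtype": str(col.dtype),
            "count": int(col.count()),       # non-null values only
            "missing": int(col.isna().sum()),
            "distinct": int(col.nunique()),  # nulls are not counted
        }
        # Numeric summaries only make sense for numeric (non-boolean) columns.
        if pd.api.types.is_numeric_dtype(col) and not pd.api.types.is_bool_dtype(col):
            row.update(mean=col.mean(), std=col.std(), min=col.min(), max=col.max())
        rows.append(row)
    # Columns without numeric summaries get NaN in those fields.
    return pd.DataFrame(rows)


# Small hypothetical dataset standing in for a real table.
sample = pd.DataFrame(
    {"age": [31, 45, None, 28], "city": ["Pune", "Oslo", "Pune", None]}
)

profile = build_profile(sample)

# Write the profile as a plain table that can be downloaded and opened locally.
profile.to_csv("data_profile.csv", index=False)
```

On Databricks, a PySpark DataFrame that is small enough to fit on the driver could be passed in as `build_profile(df.toPandas())`. For larger data, `df.summary().toPandas()` computes a similar set of statistics in Spark before converting the (small) result to pandas. Where the CSV has to be written so that it can be downloaded depends on the workspace setup.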