python, python-3.x, apache-spark, pyspark

How do I override a spark error message with my own unique description of the error


I'm using the following spark_reader method from my DataProcessor class:

from pyspark.sql import DataFrame, SparkSession

def spark_reader(spark: SparkSession, options: dict) -> DataFrame:
    df = spark.read.load(**options)
    return df

When the folder it is reading from contains no parquet files, it raises AnalysisException: Unable to infer schema for Parquet. It must be specified manually. I want to change this to something more specific, such as folder contains no <file_type> files. Does anyone know how to change the error message?

I have tried raising my own exception inside the spark_reader method, but Spark raises the error above before execution reaches my raise statement.


Solution

  • The simplest thing you can do is catch Spark's AnalysisException and re-raise it with your own message:

        from pyspark.sql import DataFrame, SparkSession
        from pyspark.sql.utils import AnalysisException

        def spark_reader(spark: SparkSession, options: dict) -> DataFrame:
            try:
                return spark.read.load(**options)
            except AnalysisException as e:
                # Replace Spark's message with a specific one;
                # `from e` preserves the original exception as the cause
                raise ValueError("folder contains no <file_type> files") from e
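
The catch-and-re-raise pattern above is plain Python exception chaining, so it can be sketched without a Spark session. In this stand-in example, MissingFilesError, fake_reader, and safe_reader are hypothetical names used only for illustration; fake_reader plays the role of spark.read.load:

```python
class MissingFilesError(Exception):
    """Hypothetical custom error carrying a clearer message."""

def fake_reader(path: str) -> list:
    # Stand-in for spark.read.load: raises when there is nothing to read
    if not path.endswith(".parquet"):
        raise RuntimeError("Unable to infer schema for Parquet.")
    return ["row1", "row2"]

def safe_reader(path: str) -> list:
    try:
        return fake_reader(path)
    except RuntimeError as e:
        # Re-raise with a specific message; `from e` keeps the original
        # error reachable via __cause__ and prints both tracebacks
        raise MissingFilesError(f"folder {path!r} contains no parquet files") from e
```

Because of the `from e`, the original Spark-style error is not lost: it appears in the traceback under "The above exception was the direct cause of the following exception", and is available programmatically as `exc.__cause__`.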