I'm using the following spark_reader method from my DataProcessor class, which reads data through a SparkSession:
    from pyspark.sql import DataFrame, SparkSession

    def spark_reader(spark: SparkSession, options: dict) -> DataFrame:
        # options is passed straight through to DataFrameReader.load (e.g. path, format)
        df = spark.read.load(**options)
        return df
When the folder it reads from contains no parquet files, it throws the error message: AnalysisException: Unable to infer schema for Parquet. It must be specified manually.
I want to change this to something more specific, such as "folder contains no <file_type> files".
Does anyone know how to change the error message?
I have tried raising an exception inside the spark_reader method, but Spark raises its own error before execution reaches my exception, so I still get the error mentioned above.
This is a very basic approach you can use: catch the AnalysisException and replace it with your own message.
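    from pyspark.sql import DataFrame, SparkSession
    from pyspark.sql.utils import AnalysisException

    def spark_reader(spark: SparkSession, options: dict) -> DataFrame:
        try:
            return spark.read.load(**options)
        except AnalysisException as e:
            # Replace Spark's generic message with a more specific one;
            # "from e" keeps the original error attached for debugging.
            raise ValueError("folder contains no <file_type> files") from e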
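
Catching every AnalysisException can hide unrelated problems (a missing path, for example), so a slightly more careful variant only translates the schema-inference failure and re-raises anything else. The sketch below is just one way to do it. The names EmptySourceError and load_or_explain are made up for illustration, and it assumes the options dict carries "format" and "path" keys, falling back to "parquet" because that is Spark's default data source unless it has been reconfigured. It matches on the message text rather than on the exception class, which is a simplification and may need adjusting for your Spark version.

```python
class EmptySourceError(Exception):
    """Hypothetical exception: the source folder has no readable files."""


def load_or_explain(load, options):
    """Call load(**options) and translate Spark's schema-inference error.

    `load` is any callable such as `spark.read.load`; passing it in keeps
    this helper independent of a running SparkSession.
    """
    try:
        return load(**options)
    except Exception as exc:
        # Assumption: the failure is identified by this fragment of Spark's
        # message ("Unable to infer schema for Parquet. ...").
        if "Unable to infer schema" not in str(exc):
            raise  # unrelated error: leave it untouched
        file_type = options.get("format", "parquet")  # assumed key
        path = options.get("path", "<unknown path>")  # assumed key
        raise EmptySourceError(
            f"Folder {path} contains no {file_type} files"
        ) from exc
```

It would be called as `df = load_or_explain(spark.read.load, options)`. Because of `from exc`, the original Spark error still shows up in the traceback.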