I've written something to read some lake files whose locations are dynamically provided by a list called partition_paths:
dfs = [spark.read.parquet(path) for path in partition_paths]
I will then combine all these dfs into one in the next line:
df = reduce(DataFrame.unionAll, dfs)
But it's possible that the partition_paths are built up incorrectly, or that a location in the lake simply doesn't exist, so I need to add error handling to the first line of code. How can I do that so it won't just stop, but will keep collecting all the dfs it can?
I don't know Spark or exactly what you're trying to do, but you don't have to use a list comprehension here. Use a regular for loop instead, so you can wrap each read in a try/except and skip the paths that fail:
from functools import reduce
from pyspark.sql import DataFrame

dfs = []
for path in partition_paths:
    try:
        dfs.append(spark.read.parquet(path))
    except Exception as e:
        # The path may be malformed or missing from the lake; log it and move on
        print(f"Error reading {path}: {e}")

df = reduce(DataFrame.unionAll, dfs)
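One edge case to watch for, as a minimal sketch: if every read fails, dfs stays empty and reduce() raises a TypeError because it has no initial value, so you may want to guard before the union:

# Assumption: you'd rather fail loudly than union an empty list
if not dfs:
    raise ValueError("none of the partition_paths could be read")
df = reduce(DataFrame.unionAll, dfs)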