I am trying to Spark to Oracle. If my connection fails, job is failing. Instead, I want to set some connection retry limit to ensure its trying to reconnect as per the limit and then fail the job if its not connecting.
Please suggest on how we could implement this.
Let's assume you are using PySpark. Recently I used this in my project so I know this works.
I have used retry PyPi project retry 0.9.2
and its application passed through extensive testing process
I used a Python class to hold the retry related configurations.
class RetryConfig:
retry_count = 1
delay_interval = 1
backoff_multiplier = 1
I collected the application parameter from runtime configurations and set them as below:
RetryConfig.retry_count = <retry_count supplied from config>
RetryConfig.delay_interval = <delay_interval supplied from config>
RetryConfig.backoff_multiplier = <backoff_multiplier supplied from config>
Then applied the on the method call that connects the DB
@retry((Exception), tries=RetryConfig.retry_count, delay=RetryConfig.delay_interval, backoff=RetryConfig.backoff_multiplier)
def connect(connection_string):
print("trying")
obj = pyodbc.connect(connection_string)
return obj
Backoff will increase the delay by backoff multiplication factor with each retry - a quite common functional ask.
Cheers!!