I have this Python code that uses the apscheduler
library to submit processes, and it works fine:
from apscheduler.schedulers.background import BackgroundScheduler

def function_to_submit(elem):
    print(str(elem))

scheduler = BackgroundScheduler()
array = [1, 3, 5, 7]
for elem in array:
    scheduler.add_job(function_to_submit, kwargs={'elem': elem})
scheduler.start()
Note that the processes are submitted in parallel and that the code does NOT wait for them to finish.
What I need is to migrate this code to Dask distributed
so that it uses workers. The problem I have is that if I use Dask's
submit method and gather the futures as below, the code waits until all the functions finish, and I need the code to continue. How can I achieve that?
from dask.distributed import Client

client = Client('127.0.0.1:8786')
future1 = client.submit(function_to_submit, 1)
future3 = client.submit(function_to_submit, 3)
L = [future1, future3]
client.gather(L)  # <-- this waits until all the futures finish
Dask distributed has a fire_and_forget
function, which is an alternative to e.g. client.compute
or dask.distributed.wait
when you want the scheduler to hang on to the tasks even after the futures have fallen out of scope in the Python process that submitted them.
from dask.distributed import Client, fire_and_forget
client = Client('127.0.0.1:8786')
fire_and_forget(client.submit(function_to_submit, 1))
fire_and_forget(client.submit(function_to_submit, 3))
# your script can now end; the scheduler will keep
# executing these tasks until they finish or the
# scheduler itself is terminated
See the dask.distributed docs on Futures for examples and other usage patterns.
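For completeness, here is a minimal sketch of the original loop ported to Dask, assuming the same function_to_submit and scheduler address shown above:

from dask.distributed import Client, fire_and_forget

def function_to_submit(elem):
    print(str(elem))

client = Client('127.0.0.1:8786')
for elem in [1, 3, 5, 7]:
    # submit returns a future immediately; fire_and_forget lets the
    # scheduler keep running the task even after the future is dropped
    fire_and_forget(client.submit(function_to_submit, elem))
# execution continues here without waiting for the tasks to finish

Note that print output from function_to_submit appears in the worker logs, not in the submitting process, since the tasks run on the Dask workers.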