I am trying to run a Flask app which consists of:

1. Yielding API requests on the fly
2. Uploading each request to a SQLAlchemy database
3. Running jobs 1 and 2 as a background process

For that I have the following code:
    import concurrent.futures
    import queue
    from concurrent.futures import ThreadPoolExecutor

    from flask import Flask, current_app

    app = Flask(__name__)
    q = queue.Queue()


    def build_cache():
        # 1. Yielding API requests on the fly
        track_and_features = spotify.query_tracks()  # <- a generator
        while True:
            q.put(next(track_and_features))


    def upload_cache(track_and_features):
        # 2. Uploading each request to a `SQLalchemy` database
        with app.app_context():
            Upload_Tracks(filtered_dataset=track_and_features)
        return "UPLOADING TRACKS TO DATABASE"

    @app.route("/cache")
    def cache():
        # 3. Do `1` and `2` as a background process
        with concurrent.futures.ThreadPoolExecutor() as executor:
            future_to_track = {executor.submit(build_cache): "TRACKER DONE"}

            while future_to_track:
                # check for status of the futures which are currently working
                done, not_done = concurrent.futures.wait(
                    future_to_track,
                    timeout=0.25,
                    return_when=concurrent.futures.FIRST_COMPLETED,
                )

                # if there is incoming work, start a new future
                while not q.empty():
                    # fetch a track from the queue
                    track = q.get()
                    # Start the load operation and mark the future with its TRACK
                    future_to_track[executor.submit(upload_cache, track)] = track

                # process any completed futures
                for future in done:
                    track = future_to_track[future]
                    try:
                        data = future.result()
                    except Exception as exc:
                        print("%r generated an exception: %s" % (track, exc))
                    del future_to_track[future]

        return "Cacheing playlist in the background..."

All of the above works, BUT NOT AS A BACKGROUND PROCESS. The app hangs when cache()
is called, and resumes only when the process is done.
I run it with gunicorn -c gconfig.py app:app -w 4 --threads 12
What am I doing wrong?

EDIT: If I simplify things in order to debug this, and simply write:
    # 1st background process
    def build_cache():
        # only ONE JOB
        tracks_and_features = spotify.query_tracks()  # <- not a generator
        while True:
            print(next(tracks_and_features))


    # background cache
    @app.route("/cache")
    def cache():
        executor.submit(build_cache)
        return "Cacheing playlist in the background..."

THEN the process runs in the background.
However, if I add another job:

    def build_cache():
        tracks_and_features = spotify.query_tracks()
        while True:
            # SQLalchemy db
            Upload_Tracks(filtered_dataset=next(tracks_and_features))

then the background processing stops working again.
In short:
Background only works if I run ONE job at a time (which was the limitation behind the idea of using queues in the first place).
It seems like the problem is in binding the background process to SQLAlchemy, but I don't know. I'm totally lost here.
Still not sure what you meant by "I mean the app waits for all requests to be made at login and only then goes to homepage. It should go right away to homepage with requests being made at background".
There are a few issues here:

- If `UploadTracks` is writing to the database, there might be a lock on the table. Check your indices and inspect lock waits in your database.
- A second `UploadTracks` call may be waiting for the first to return its connection.

In your first example, the endpoint waits for all futures to finish before returning, whereas in your second example the endpoint returns immediately after submitting tasks to the executor. If you want Flask to respond quickly while the tasks are still running in background threads, remove the `with concurrent.futures.ThreadPoolExecutor() as executor:` block and construct a global thread pool at the top of the module.

With `with`, the context manager waits for all submitted tasks before exiting, but I am not sure if that's your main issue.
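To make the difference concrete, here is a minimal standard-library sketch (no Flask or SQLAlchemy involved). The names `slow_job`, `handler_with_context_manager`, `handler_with_global_pool` and `GLOBAL_POOL` are made up for illustration; `slow_job` just sleeps to stand in for the upload work. In the real app, the route function would play the role of the second handler and call `submit` on a module-level pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_job(seconds):
    # Hypothetical stand-in for the real work (API calls, DB uploads).
    time.sleep(seconds)
    return "done"


def handler_with_context_manager():
    # Leaving the `with` block calls executor.shutdown(wait=True),
    # so this function does not return until slow_job has finished.
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=2) as pool:
        pool.submit(slow_job, 0.5)
    return time.monotonic() - start


# Module-level pool: created once and shared by all handler calls.
GLOBAL_POOL = ThreadPoolExecutor(max_workers=2)


def handler_with_global_pool():
    # submit() only schedules the job; the function returns right away
    # while slow_job keeps running in a worker thread.
    start = time.monotonic()
    future = GLOBAL_POOL.submit(slow_job, 0.5)
    return time.monotonic() - start, future


if __name__ == "__main__":
    print("with-block handler took %.2fs" % handler_with_context_manager())
    elapsed, future = handler_with_global_pool()
    print("global-pool handler took %.2fs" % elapsed)
    print("background job result:", future.result())
```

The first handler should take roughly as long as the job itself, while the second returns almost immediately. Note that this only addresses the blocking caused by the `with` block; it does not rule out the database-side causes listed above.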