I have couple of simple tasks which could take maximum 20 seconds to complete, so I decided to use separate thread to accomplish them. I want thread to do the job and update database with result.
While it works (no exceptions yet) I have lack of understanding Flask internals and how it works with WSGI server. I'm not quite sure that on some amount of parallel requests it won't end with some database access error.
Simplified code:
from time import time, sleep
from threading import Thread
from peewee import *
from playhouse.shortcuts import model_to_dict
from flask import Flask, abort, jsonify
db = SqliteDatabase("test.db")
Flask(__name__)
class Task(Model):
status = IntegerField(default=0)
result = TextField(null=True)
class Meta:
database = db
def do_smth(task_id):
start = time()
sleep(10)
# DATABASE UPDATE HERE
Task.update({Task.status: 1, Task.result: f"{start} - {time()}"})\
.where(Task.id == task_id).execute()
@app.route("/new")
def new_task():
try:
task = Task.create()
except IntegrityError:
abort(500)
else:
Thread(target=do_smth, args=(task.id,)).start()
return jsonify(model_to_dict(task))
@app.route("/get/<int:task_id>")
def get_task(task_id):
try:
task = Task.get(Task.id == task_id)
except Task.DoesNotExist:
abort(404)
else:
return jsonify(model_to_dict(task))
@app.before_request
def before_request():
db.connect()
@app.after_request
def after_request(response):
db.close()
return response
if __name__ == "__main__":
with db:
db.create_tables([Task])
app.run(host="127.0.0.1", port=5000)
As it suggested in peewee tutorial I added custom Flask.before_request
and Flask.after_request
which open and close database connection.
So the question is how to update database from separate thread safely? I have had an idea to add route which will update database and send request from thread, but I find it kinda dumb.
P.S. I've tried my best trying to be precise, but if something is unclear I will try to clarify it, just ask it in comments section.
This is a good question:
how to update database from separate thread safely?
With Sqlite you have to remember that it only allows one writer at a time. So you have to manage your connections carefully to ensure that you are only doing a write txn when you have to, and that you're committing it as soon as you're done with it.
Since you're opening and closing the DB during the lifetime of a request, and running your DB operations in separate thread(s), you should be OK for a smallish number of operations (100?). I think the main thing I'd be careful about is, during your task body, be sure you're only holding that write txn open for as short a time as possible:
def do_smth(task_id):
# open a database connection. it will be read-only for now.
with db.connection_context():
start = time()
sleep(10)
with db.atomic() as txn: # here is write tx, keep this brief!
Task.update({Task.status: 1, Task.result: f"{start} - {time()}"})\
.where(Task.id == task_id).execute()
See the first section on transactions: https://charlesleifer.com/blog/going-fast-with-sqlite-and-python/
For a more drastic approach, you can try this: https://charlesleifer.com/blog/multi-threaded-sqlite-without-the-operationalerrors/