Search code examples
pythonmysqlflaskflask-mysql

Best practices for retrieving the most recent data from a database and presenting it on a website


I have the following stack in my project:

Project Structure

Cronjobs are running every 15 minutes to populate the MySQL database with data from the API. At the same moment, the flask app renders this date. So, everytime an user visits the website, the table containing a big quantity of data is loaded. The request is made to the database to load the data. The data will be presented in the bootstrap table as a response.

I'd want users to be able to refresh the table data (by clicking on the refresh button) and always view the most recent data (rather than having to wait for the cronjob to restart and update the database). If I trigger to import the data from the API again whenever the user clicks on the refresh button, the operation will be extremely time consuming, and the entire data will be loaded for a number of minutes before the page can be shown again. What is the best way to refresh table data that has been loaded from the database?


Solution

  • This problem can be easily solved with a task queue that supports locking, e.g. huey (if you want something simple).

    Let's call the task that fetches the API and updates the local database: fetch_api(). It can be run by cron, or by a user.

    1. When it runs, first it acquires a specific lock, which means only one instance of fetch_api() task can run concurrently.
    2. Next is to check what was the last time the task has been successfully run and updated the local database. If it was within a given interval, e.g. 1 min from the last importing, then it skips, since the data can be seen as "fresh", so no need to update it once again.
    3. Otherwise it starts the importing procedure and updates the local database.
    4. Finally it dispatches a signal, that new data in ready.
    5. The clients detect the signal via polling or websocket and fetches the new rendered table.

    This way when an users clicks on the refresh button, a fetch_api() task can be just enqueued, the task runner will handle the rest. The key point is the locking, so only one task will fetch the API.