I have a project idea that involves using the Spotify API to load users' daily listening history and emailing each user a weekly "wrapped" summary. I need some advice on the architecture. My plan is to use Airflow to run daily tasks that load the data into Postgres, and then run a weekly task to generate the wrapped. This works for a single user, but how can I make it scalable? Or is there another way to build this service that I am missing?
Let's assume you have a list of users and you want to fetch their data from the Spotify API.
This works for a single user, but how can I make it scalable
By scalable, I understand that you want to fetch the data efficiently, and here are some options:
Whichever method you choose, you can store the data in remote storage, then process it in one scalable task (Spark or Trino, for example) to create the reports, and send them in a third task.
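To make the three-task split concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: `fetch_recent_plays` is a hypothetical stand-in that returns fake data instead of calling Spotify, a local directory plays the role of remote storage, a simple loop with `Counter` stands in for the Spark/Trino job, and `print` stands in for the mailer. Each function would map to one task in your scheduler.

```python
import json
import tempfile
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fetch_recent_plays(user_id: str, day: str) -> list[dict]:
    """Hypothetical stand-in for the Spotify API call; returns fake plays."""
    return [
        {"user_id": user_id, "day": day, "track": f"track-{i % 2}"}
        for i in range(3)
    ]


def ingest_day(user_ids, day, root, fetch=fetch_recent_plays, max_workers=8):
    """Task 1 (daily): fetch all users in parallel, one file per user per day."""
    day_dir = Path(root) / day
    day_dir.mkdir(parents=True, exist_ok=True)

    def fetch_and_store(user_id):
        plays = fetch(user_id, day)
        # Assumes user_id is safe to use as a file name.
        (day_dir / f"{user_id}.json").write_text(json.dumps(plays))
        return len(plays)

    # Threads suit this step because the real work is waiting on HTTP calls.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(fetch_and_store, user_ids))


def build_weekly_reports(root, days):
    """Task 2 (weekly): one pass over the stored files, top 3 tracks per user."""
    counts = {}
    for day in days:
        for path in sorted((Path(root) / day).glob("*.json")):
            for play in json.loads(path.read_text()):
                counts.setdefault(play["user_id"], Counter())[play["track"]] += 1
    return {user: c.most_common(3) for user, c in counts.items()}


def send_reports(reports, send=print):
    """Task 3: hand each report to a mailer (print is a placeholder)."""
    for user, top_tracks in reports.items():
        send(f"To {user}: your top tracks this week: {top_tracks}")
    return len(reports)
```

A run would call `ingest_day(users, day, root)` once per day, then `build_weekly_reports(root, last_seven_days)` followed by `send_reports(...)` once per week. Because the fetch step only writes raw files, you can change how it is parallelized (threads, one scheduler task per batch of users, a queue of workers) without touching the reporting or mailing steps.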