Tags: python, multithreading, web-services, service

Procedures to Trigger Continuously


I am working on a project that periodically checks the sites that users enter (for example, every 3 minutes). The project consists of two services. One communicates with the user (I developed it with NodeJS). The other is a Python service that performs the checks in the background and writes the resulting status notifications to the database.

I think Python is well suited to this thanks to its libraries. Where I am stuck is this: suppose that, over time, users enter a large number of addresses. How can I check all of them at the same time?

The first thing that came to mind was threads, so I wrote the following solution:

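import threading

# `websites` holds the rows loaded from the database;
# `check` is the function that checks a single site.
threads = []

# Start one thread per website.
for website in websites:
    thread = threading.Thread(target=check, args=(website[0], website[1], website[2]))
    thread.start()
    threads.append(thread)

# Wait for every check to finish.
for thread in threads:
    thread.join()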

This code checks the 200 URLs (test data) registered in the database. It uses threads to run the checks at the same time, but because it starts one thread per URL, I suspect this is a misuse of threads. In short, I think there must be a healthier approach. Which path should I follow here? What do I need to change and work on?

Thanks.


Solution

  • In short, I think there must be a healthier approach. Which path should I follow here? What do I need to change and work on?

    In my view, a more scalable solution is a task queue, since a single machine has limited processing capacity. For example, if each instance of the service can handle up to 200 URLs at a time and there are 800 URLs to check, you can split them into four batches of 200 and use a tool such as Celery to distribute the batches across workers.

    Because the work is network-bound, each worker should also use some form of concurrency, such as threads or coroutines, to maximize throughput.
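
As a rough illustration of the two ideas above, the sketch below splits the URLs into fixed-size batches and checks each batch with a bounded thread pool instead of one thread per URL. It uses only the standard library. The names `chunked` and `check_batch` are hypothetical, and `check` is assumed here to take a single URL and return a truthy value on success, which differs from the three-argument `check` in the question. Handing each batch to a queue such as Celery is left out.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterator, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def check_batch(
    urls: Sequence[str],
    check: Callable[[str], bool],
    max_workers: int = 20,  # assumed cap; tune it for the machine and the sites
) -> Dict[str, bool]:
    """Run `check` on every URL using at most `max_workers` threads."""
    results: Dict[str, bool] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so results can be labelled.
        futures = {pool.submit(check, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = bool(future.result())
            except Exception:
                # A check that raises is recorded as a failed check.
                results[url] = False
    return results
```

With 800 URLs, `chunked(urls, 200)` yields four batches, and each worker would call `check_batch` on the batch it receives. The pool keeps the number of live threads fixed no matter how many addresses users add, which is the main difference from starting one thread per URL.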