python multithreading curl http-status

Multithreaded script for checking HTTP status codes


Hi Stack Overflow community,

I would like to create a script that uses multithreading to send a large number of parallel requests and collect the HTTP status codes for a large list of URLs (more than 30k vhosts).

The requests can be executed from the same server where the websites are hosted.

I have been using multithreaded curl requests, but I'm not really satisfied with the results: a complete check of 30k hosts takes more than an hour.

Does anyone have any tips, or is there a more performant way to do this?


Solution

  • After testing some of the available solutions, the simplest and fastest option turned out to be webchk.

    webchk is a command-line tool written in Python 3 for checking the HTTP status codes and response headers of URLs.

    The speed was impressive and the output was clean: it checked 30k vhosts in about 2 minutes.

    https://webchk.readthedocs.io/en/latest/index.html

    https://pypi.org/project/webchk/
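
For anyone who would rather script the check than install a tool, here is a minimal sketch of the thread-pool approach the question asks about, using only the Python standard library. It is not how webchk works internally. The names `fetch_status` and `check_urls`, the `urls.txt` file, and the worker count and timeout values are all assumptions chosen for illustration, so tune them for your server. Note that `urlopen` follows redirects, so the code reported is that of the final response.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_status(url, timeout=5.0):
    """Return (url, status): an int HTTP status code, or an 'error: ...' string."""
    try:
        # HEAD skips the response body; some servers may answer HEAD
        # differently from GET, so switch the method if results look odd.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return url, response.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as exceptions but still carry a code.
        return url, exc.code
    except Exception as exc:
        # DNS failures, refused connections, timeouts, malformed URLs, etc.
        return url, f"error: {exc}"


def check_urls(urls, max_workers=100, timeout=5.0):
    """Check all URLs in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda u: fetch_status(u, timeout), urls))


# Hypothetical usage, assuming urls.txt holds one URL per line:
#
#     with open("urls.txt") as handle:
#         urls = [line.strip() for line in handle if line.strip()]
#     for url, status in check_urls(urls):
#         print(status, url)
```

Because the work is almost entirely waiting on the network, threads are enough here; a short timeout matters more than the exact worker count, since a few unresponsive hosts can otherwise dominate the total run time.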