I wrote some TV guide scrapers in PHP. I run them from a scheduler script that is executed by a cron job. This script runs every minute and checks whether a scraper needs to start, so I can alter and manage the scraping jobs without having to modify the crontab itself.
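The cron entry itself is just a single line that never changes (the path to the scheduler is illustrative):

* * * * * /usr/local/bin/php /path/to/scheduler.php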
These scraping scripts vary in runtime: some take no more than a minute, while others can take up to 4 hours. When I run them one after another there is no problem, but when I try to run two scripts simultaneously, one or both of them hang, resulting in an email from cron:
sh: line 1: 700865 Hangup /usr/local/bin/php /home/id789533/domains/erdesigns.eu/public_html/tvg_schedules/scraper.php --country=dk --provider=1 --scraper=tv2 2>&1
Here /usr/local/..... is the command for the script, which is called from the scheduler script. I can't find anything related to this message and have no idea how to fix it. I can post the script itself if needed.
Any advice and help would be appreciated.
[Edit] I also took a look at the resource usage: memory never goes above 150 MB and CPU load never exceeds 15%, against limits of 1 GB and 400%.
I execute the scripts from the scheduler script like so:
shell_exec(sprintf("/usr/local/bin/php %s 2>&1", $scraper));
where $scraper is the filename. It executes the script like it should, but after a while I get the message sh: line 1: 000000 Hangup.
I know for sure that it is not allocating too much memory. Can someone point me in the right direction? I don't know where to look right now.
PHP is a language intended for the web, with features such as a cap on the maximum execution time to make sure scripts do not run indefinitely and thereby block resources. PHP is therefore not the best choice for this task.
If it is only a short script, I would advise converting it to a Bash or Python script. However, if you want to stick with PHP, check your php.ini file for settings restricting execution time.
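For example, you can verify which limit is in effect for the CLI binary that cron invokes with a small test script (the file name is just an example):

<?php
// limits.php - print the execution-time settings the CLI binary actually uses.
// Run it with the same binary as the cron job: /usr/local/bin/php limits.php
echo 'Loaded php.ini:     ', (php_ini_loaded_file() ?: '(none)'), PHP_EOL;
echo 'max_execution_time: ', ini_get('max_execution_time'), PHP_EOL;

If a limit is set, a long-running scraper can also lift it for its own process by calling set_time_limit(0); at the top of the script, where 0 means no time limit.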