I've got a site that displays data from a game server. The game has different "domains" (which are actually just separate servers) that the users play on.
Right now, I've got 14 cron jobs running at different intervals of the hour every 6 hours. All 14 files that are run are pretty much the same, and each takes around 75 minutes (an hour and 15 minutes) to complete its run.
I had thought about just using 1 file run from cron and looping through each server, but this would just cause that one file to run for 18 hours or so. My current VPS is set to only allow 1 vCPU, so I'm trying to accomplish things and stay within my allotted server load.
Seeing that the site needs to have updated data available every 6 hours, this isn't doable.
I started looking into message queues and passing some information to a background process that will perform the work in question. I started off trying to use resque and php-resque, but my background worker died as soon as it was started. So, I moved on to ZeroMQ, which seems to be more what I need, anyway.
I've set up ZMQ via Composer, and everything during the installation went fine. In my worker script (which will be called by cron every 6 hours), I've got:
$dataContext = new ZMQContext();
$dataDispatch = new ZMQSocket($dataContext, ZMQ::SOCKET_PUSH);
$dataDispatch->bind("tcp://*:50557");
$dataDispatch->send(0);
foreach($filesToUse as $filePath){
    $dataDispatch->send($filePath);
    sleep(1);
}
$filesToUse = array();
$blockDirs = array_filter(glob('mapBlocks/*'), 'is_dir');
foreach($blockDirs as $k => $blockDir){
    $files = glob($rootPath.$blockDir.'/*.json');
    $key = array_rand($files);
    $filesToUse[] = $files[$key];
}
$mapContext = new ZMQContext();
$mapDispatch = new ZMQSocket($mapContext, ZMQ::SOCKET_PUSH);
$mapDispatch->bind("tcp://*:50558");
$mapDispatch->send(0);
foreach($filesToUse as $blockPath){
    $mapDispatch->send($blockPath);
    sleep(1);
}
$filesToUse is an array of files submitted by users that contain information to be used in querying the game server. As you can see, I'm looping through the array and sending each file to the ZeroMQ listener file, which contains:
$startTime = time();
$context = new ZMQContext();
$receiver = new ZMQSocket($context, ZMQ::SOCKET_PULL);
$receiver->connect("tcp://*:50557");
$sender = new ZMQSocket($context, ZMQ::SOCKET_PUSH);
$sender->connect("tcp://*:50559");
while(true){
    $file = $receiver->recv();
    // -------------------------------------------------- do all work here
    // ... ~ 75:00 [min] DATA PROCESSING SECTION foreach .recv()-ed WORK-UNIT
    // ----------------------------------------------------------------------
    $endTime = time();
    $totalTime = $endTime - $startTime;
    $sender->send('Processing of domain '.listener::$domain.' completed on '.date('M-j-y', $endTime).' in '.$totalTime.' seconds.');
}
Then, in the final listener file:
$context = new ZMQContext();
$receiver = new ZMQSocket($context, ZMQ::SOCKET_PULL);
$receiver->bind("tcp://*:50559");
while(true){
    $log = fopen($rootPath.'logs/sink_'.date('F-jS-Y_h-i-A').'.txt', 'a');
    fwrite($log, $receiver->recv());
    fclose($log);
}
When the worker script is run from cron, I get no confirmation text in my log.
Q1) Is this the most efficient way to do what I'm trying to do?
Q2) Am I trying to use or implement ZeroMQ incorrectly here?
And, as it would seem, using cron to call 14 files simultaneously is causing the load to far exceed the allotment. I know I could probably just set the jobs to run at different times throughout the day, but if at all possible, I would like to keep all updates on the same schedule.
I have since gone ahead and upgraded my VPS to 2 CPU cores, so the load aspect of the question isn't really all that relevant anymore.
The code above has also been changed to the current setup.
After the code update, I am now getting an email from cron with the error:
Fatal error: Uncaught exception 'ZMQSocketException' with message 'Failed to bind the ZMQ: Address already in use'
Running your scripts through cron or through ZeroMQ will make absolutely no difference in how much CPU you will need. The only difference between the two is that the cron job starts your script at intervals and the messaging queue will start your script based on some user action.
At the end of the day, you need more available threads to run your scripts. But before you go down that path, you may want to take a look at your scripts. Maybe there's a more efficient way of writing them so that they don't take as many resources? And have you looked at your CPU utilization rate? Most web hosting services have built-in metrics that you can pull up through their console. You might not be using as many resources as you think.
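If the host's console isn't handy, the load can also be read from PHP itself. This is only a minimal sketch, assuming a Linux VPS where `sys_getloadavg()` is available; `loadSummary()` and the core count passed to it are hypothetical names for illustration, not part of your scripts:

```php
<?php
// Hypothetical helper: format the 1/5/15-minute load averages and
// the 1-minute load relative to the number of CPU cores.
function loadSummary(array $load, int $cores): string
{
    [$one, $five, $fifteen] = $load;
    $perCore = $one / max(1, $cores);

    return sprintf(
        'load1=%.2f load5=%.2f load15=%.2f per-core=%.2f',
        $one,
        $five,
        $fifteen,
        $perCore
    );
}

// sys_getloadavg() returns false when the load cannot be read.
$load = sys_getloadavg();
if ($load !== false) {
    // Assumption: a 1-vCPU plan; pass 2 on a 2-core plan.
    echo loadSummary($load, 1), PHP_EOL;
}
```

Logging this line at the start and end of one of the 75-minute jobs would give a rough idea of whether the job keeps a core busy or mostly waits. A per-core value that stays well below 1.0 during a run would hint at the latter.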
The fact that a single file looping through all the servers takes so much longer than running the files side by side suggests that your scripts aren't doing their work in parallel. A single instance of your script is not using up all available resources, and thus you only see speed gains when you run multiple instances of your scripts.
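One way to keep a single cron entry while still running several instances side by side is a small launcher that caps how many run at once. This is only a sketch under assumptions: `runBounded()` is a hypothetical helper, it relies on `proc_open()` being enabled on the host and on PHP 7.3+ for `array_key_first()`, and the right cap depends on what the VPS plan tolerates.

```php
<?php
// Hypothetical launcher: run shell commands with at most $maxParallel
// child processes alive at the same time. Returns exit codes by key.
function runBounded(array $commands, int $maxParallel): array
{
    $maxParallel = max(1, $maxParallel);
    $queue = $commands;
    $running = [];
    $exitCodes = [];

    while ($queue || $running) {
        // Start new children while there is spare capacity.
        while ($queue && count($running) < $maxParallel) {
            $key = array_key_first($queue);
            $command = $queue[$key];
            unset($queue[$key]);

            // Empty descriptor spec: children inherit stdin/stdout/stderr.
            $process = proc_open($command, [], $pipes);
            if ($process === false) {
                $exitCodes[$key] = -1; // could not be started
                continue;
            }
            $running[$key] = $process;
        }

        // Collect children that have finished since the last poll.
        foreach ($running as $key => $process) {
            $status = proc_get_status($process);
            if (!$status['running']) {
                $exitCodes[$key] = $status['exitcode'];
                proc_close($process);
                unset($running[$key]);
            }
        }

        usleep(200000); // poll roughly five times per second
    }

    return $exitCodes;
}
```

A cron-run script could then build one command per domain (for example `php update_domain.php 3`, where `update_domain.php` is a made-up name standing in for your existing per-domain file) and call `runBounded($commands, 2)`, so that all 14 updates stay on the same schedule without 14 processes competing for the CPU at once.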