I tried to crawl this page: http://hea.uum.edu.my/index.php/academic/current-student/convocation
Here is my code:
<?php
require_once 'vendor/autoload.php';
use Goutte\Client;
$client = new Client();
$crawler = $client->request('GET', 'http://hea.uum.edu.my/index.php/academic/current-student/convocation');
$step = array();
$i = 0;
$crawler->filter('.sppb-addon.sppb-addon-accordion')->each(function ($node) {
    global $step, $i;
    $step[$i]['item'] = array();
    $node->filter('.sppb-addon-title')->each(function ($node) {
        global $step, $i;
        $step[$i]['cat'] = $node->html();
    });
    $j = 0;
    $node->filter('.sppb-panel-heading > .sppb-panel-title')->each(function ($node) {
        global $step, $i, $j;
        $step[$i]['item'][$j++]['title'] = $node->html();
    });
    $h = 0;
    $node->filter('.sppb-panel-body .sppb-addon-content')->each(function ($node) {
        global $step, $i, $h;
        $step[$i]['item'][$h++]['content'] = $node->html();
    });
    $i++;
});
print_r($step);
It is almost perfect, except that the first element of item has no index and the numbering does not reset for each new array.
Array
(
[0] => Array
(
[item] => Array
(
[] => Array //here no number
(
[title] => STEP 1 : ...
[content] => <p>If you are eligible to graduate...
...
[1] => Array
(
[item] => Array
(
[13] => Array //here the number should be 0
(
[title] => STEP 14 : CONVOCATION DRESS ..
[content] => <p>Here are the official...
You can see the result here: view-source:http://convo18.uum.my/
Please help. I am also interested to know if there is a more elegant solution for this situation, on top of solving my problem.
Thanks for your time.
=========================================================================
UPDATE: Thanks to @NigelRen for the suggestion, here is the code that works:
<?php
require_once 'vendor/autoload.php';
use Goutte\Client;
$client = new Client();
$crawler = $client->request('GET', 'http://hea.uum.edu.my/index.php/academic/current-student/convocation');
$step = array();
$i = 0;
$crawler->filter('.sppb-addon.sppb-addon-accordion')->each(function ($node) use (&$step, &$i) {
    $step[$i]['item'] = array();
    $node->filter('.sppb-addon-title')->each(function ($node) use (&$step, &$i) {
        $step[$i]['cat'] = $node->html();
    });
    $h = 0;
    $node->filter('.sppb-panel-heading > .sppb-panel-title')->each(function ($node) use (&$step, &$i, &$h) {
        $step[$i]['item'][$h++]['title'] = $node->html();
    });
    $h = 0;
    $node->filter('.sppb-panel-body .sppb-addon-content')->each(function ($node) use (&$step, &$i, &$h) {
        $step[$i]['item'][$h++]['content'] = $node->html();
    });
    $i++;
});
print_r($step);
I just tested a dummy setup, and I think the solution is to define $j and $h outside any nested function. They are not defined at global scope, so when you say global $step, $i, $j; and then $j++, PHP treats $j as undefined (null) the first time, which ends up as an empty array key, and the post-increment then sets it to 1. The test code to show this is...
$a = function() {
    global $c;
    echo "Value=";
    echo $c++;
    echo PHP_EOL;
};
$a();
$a();
outputs...
Value=
Value=1
Whereas...
$c=0;
$a = function() {
    global $c;
    echo "Value=";
    echo $c++;
    echo PHP_EOL;
};
$a();
$a();
Gives the desired output...
Value=0
Value=1
So define all of these at the start...
$i = 0;
$j = 0;
$h = 0;
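One caveat, shown as a minimal sketch rather than your actual crawler code (the names $seen, $group, $inner and $outer are made up for illustration): defining the counters globally gets rid of the empty key, but a plain $j = 0; inside the outer closure only creates a local variable there. The global $j that the nested closure increments is never reset, which would explain why the second group starts at 13 instead of 0.

```php
<?php
$j = 0;          // global counter, defined up front as suggested above
$seen = array(); // hypothetical: records which keys each group received
$group = 0;      // hypothetical: index of the group being processed

$inner = function () {
    global $j, $seen, $group;
    $seen[$group][] = $j++; // reads and increments the GLOBAL $j
};

$outer = function () use ($inner) {
    $j = 0; // local to $outer only; the global $j is untouched
    $inner();
    $inner();
};

$group = 0;
$outer();
$group = 1;
$outer();

print_r($seen); // group 0 receives keys 0 and 1, group 1 receives 2 and 3
```

So with global you would have to reset the global variable itself before each group, which is one more reason to prefer passing the counter by reference.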
Edit:
Although, as per my original comment, global is generally frowned on: it makes testing more difficult and also (as found) may not work as you expect. The suggested method is to use the function(...) use (...) { ... } closure format, so in the example...
$c=0;
$a = function() use (&$c) {
    echo "Value=";
    echo $c++;
    echo PHP_EOL;
};
$a();
$a();
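As for something more elegant: if the goal is only to pair each title with its content, the counters can probably be dropped altogether. The sketch below uses plain arrays as stand-ins for what the crawler would give you ($titles and $contents are made-up sample data, not the real page). Symfony's DomCrawler documents each() as returning an array of the closure's return values, so, assuming that holds for your Goutte version, the two lists could be collected that way and then combined like this.

```php
<?php
// Hypothetical stand-in data; in the real script these two lists would
// come from the title and content selectors of one accordion.
$titles   = array('STEP 1', 'STEP 2');
$contents = array('<p>first</p>', '<p>second</p>');

// array_map() with two arrays walks them in parallel and returns a
// list indexed from 0, so no manual counter is needed.
$items = array_map(function ($title, $content) {
    return array('title' => $title, 'content' => $content);
}, $titles, $contents);

print_r($items);
```

Because each group builds its own $items list, the numbering naturally restarts at 0 for every group.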