I'm trying to get the content of every h2 (the article titles) inside the div with id="firehoselist", but the following code only returns the first result. Any ideas, please?
$results = [];
$crawler = new Crawler($content);
$crawler->filterXPath('//div[@id="firehoselist"]//*')->each(function (Crawler $node) use (&$results) {
    $results[] = trim($node->filter('h2')->text());
});
The content I'm trying to scrape is too messy to post here, but it comes from the Slashdot website (slashdot.org).
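//div[@id="firehoselist"]
selects the div whose ID is firehoselist, and with //* appended your expression matches every element nested inside that div. For each of those elements, $node->filter('h2')->text() returns only the first result, because text() reads just the first node in the matched list.
What you need is to select every #firehoselist h2 of the parsed HTML directly: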
$crawler->filterXPath('//div[@id="firehoselist"]//h2')->each(function (Crawler $node) use (&$results) {
    $results[] = trim($node->text());
});
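If you want to check the difference between the two XPath expressions without Symfony, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath instead of the Crawler. The HTML is made-up sample markup (the article, a, and p elements are assumptions for illustration), not Slashdot's real structure.

```php
<?php
// Made-up sample markup; the real Slashdot page is structured differently.
$html = <<<'HTML'
<html><body>
<div id="firehoselist">
  <article><h2><a href="#">First story</a></h2><p>Summary one</p></article>
  <article><h2><a href="#">Second story</a></h2><p>Summary two</p></article>
</div>
<div id="other"><h2>Not a story</h2></div>
</body></html>
HTML;

$doc = new DOMDocument();
// Collect libxml warnings (e.g. about HTML5 tags such as <article>) instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The question's expression: every element nested inside the div
// (articles, headings, links, paragraphs), not just the headings.
$all = $xpath->query('//div[@id="firehoselist"]//*');

// The answer's expression: only the h2 elements inside the div.
$titles = [];
foreach ($xpath->query('//div[@id="firehoselist"]//h2') as $h2) {
    $titles[] = trim($h2->textContent);
}

print_r($titles);
```

With this sample, the second query yields one entry per heading inside #firehoselist and skips the h2 in the other div, which is the same selection the filterXPath call above makes.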