I'm using fabpot/goutte 3.2 and trying the following code to access a website, but it isn't working:
$client = new \Goutte\Client();
$guzzleClient = new \GuzzleHttp\Client(array(
'curl' => array(
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_SSL_VERIFYHOST => false,
CURLOPT_SSL_VERIFYPEER => false),
));
$client->setClient($guzzleClient);
$crawler = $client->request('GET', "www.superpharm.pl/sklepy");
$crawler->filter('body')->each(function ($node) {
print $node->text() . "\n";
});
Getting this error:
In CurlFactory.php line 186:
[GuzzleHttp\Exception\ConnectException]
cURL error 7: Failed to connect to localhost port 80: Connection refused (see http://curl.haxx.se/libcurl/c/libcurl-errors.html)
This is working:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "www.superpharm.pl/sklepy");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$html = curl_exec($ch);
echo $html;
This is working too (without goutte client):
$client = new \GuzzleHttp\Client();
$res = $client->request('GET', 'www.superpharm.pl/sklepy', ['verify' => false]);
echo $res->getBody();
Does anyone know why it isn't working with Goutte?
The client used by Goutte first attempts to build an absolute URI from the $uri argument. Because you have omitted the scheme (i.e. https://) from your URI, the client treats it as a relative path and resolves it against its default host, localhost, which transforms it to this:
http://localhost/www.superpharm.pl/sklepy
The solution is to simply change your URI to include the scheme like so:
$crawler = $client->request('GET', "https://www.superpharm.pl/sklepy");
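To illustrate why the missing scheme matters, here is a minimal standalone sketch in plain PHP. It does not call Goutte at all: `resolveAgainstBase()` is a simplified, hypothetical stand-in for the resolution step described above, with the `http://localhost/` base assumed from the error message in the question, and `ensureScheme()` is a hypothetical helper you could run over user-supplied URLs before passing them to `$client->request()`.

```php
<?php

// Hypothetical illustration only; this is NOT Goutte's actual code.
// A URI without a scheme is treated as a relative path and appended
// to the base URI (assumed here to be http://localhost/).
function resolveAgainstBase(string $uri, string $base = 'http://localhost/'): string
{
    // A URI that already starts with "scheme://" is absolute: keep it as-is.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $uri) === 1) {
        return $uri;
    }

    return rtrim($base, '/') . '/' . ltrim($uri, '/');
}

// Hypothetical helper: prepend a default scheme when none is present.
function ensureScheme(string $uri, string $defaultScheme = 'https'): string
{
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $uri) === 1) {
        return $uri;
    }

    // ltrim() also handles protocol-relative input such as "//host/path".
    return $defaultScheme . '://' . ltrim($uri, '/');
}

echo resolveAgainstBase('www.superpharm.pl/sklepy'), "\n";
// http://localhost/www.superpharm.pl/sklepy

echo resolveAgainstBase(ensureScheme('www.superpharm.pl/sklepy')), "\n";
// https://www.superpharm.pl/sklepy
```

Whether to default to https or http is a judgment call; the site in the question redirects to HTTPS, so https is assumed here.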