
"failed to open stream: Connection timed out" even though the remote site is up and running


I am testing a PHP script to scrape a remote site using the Simple HTML DOM Parser library. The code used to work fine; however, it suddenly stopped today.

<?php

require_once 'backend/connector.php';
require_once 'table_access/simplehtmldom_1_5/simple_html_dom.php';
ini_set("display_errors", 1);
error_reporting(E_ALL);
echo file_get_html("http://www.google.com");

?>

The error it's giving is:

Warning: file_get_contents(http://www.google.com): failed to open stream: Connection timed out in /home/peppyoil/public_html/sandboxassets/engines/table_access/simplehtmldom_1_5/simple_html_dom.php on line 75

I don't understand why it keeps timing out even though the remote site is perfectly reachable from a browser. I would understand if it said "connection refused" or something like that, but what could possibly explain the timeouts?

I tried using cURL:

<?php

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.google.com');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_PROXY, $proxy); // $proxy is ip of proxy server
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$response = curl_exec($ch);
// The status code must be read after curl_exec(); before the transfer it is always 0.
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE); // this results in 0 every time
if ($response === false) $response = curl_error($ch);
echo stripslashes($response);
curl_close($ch);

?>

That didn't work either; it threw the following error instead:

Connection timed out after 10001 milliseconds

The test script given above is sitting at http://www.peppyburro.com/sandboxassets/engines/test1.php

Update 2: Just checked my port 80 and found this:

Outbound Port 80, 443, 587 and 465 for your account are BLOCKED. Reason for the port block: During our regular scans, we have found malicious files in your account which may be infected with malware.

Could this have anything to do with the timeouts?


Solution

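  • Outbound Port 80, 443, 587 and 465 for your account are BLOCKED. Reason for the port block: During our regular scans, we have found malicious files in your account which may be infected with malware.

    As the notice quoted above already states, your hosting provider has flagged content in your account as malicious and blocked outbound traffic on those ports. That block explains the timeouts: both file_get_contents and cURL need an outbound connection on port 80 (or 443), and when the host blocks it the connection attempt most likely gets dropped, so it times out instead of being refused.

    The script was probably flagged because what you are trying to achieve is similar to a proxy server and falls into the category of URL-rewriting sites. You therefore cannot host that script, because it can be used to directly access content that is blocked in the visitor's region but not in the region of your hosting provider.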

    Hope this helps.
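
To confirm that the block, and not the remote site, is behind the timeouts, a quick outbound check along these lines may help. This is only a minimal sketch: `probeOutboundPort` and `classifyConnectError` are hypothetical helper names, the code assumes PHP 7 or later, and the errno values (110 for a timeout, 111 for a refusal) are the Linux ones, so they may differ on other systems.

```php
<?php

/**
 * Map a connect() error number to a coarse label.
 * Assumption: Linux errno values (110 = ETIMEDOUT, 111 = ECONNREFUSED).
 */
function classifyConnectError(int $errno): string
{
    switch ($errno) {
        case 110:
            return 'timed out';
        case 111:
            return 'refused';
        case 0:
            // fsockopen() reports 0 when the failure happened before connect()
            return 'failed before connect (e.g. DNS lookup)';
        default:
            return 'other error';
    }
}

/**
 * Try to open a plain TCP connection and report what happened.
 * Hypothetical helper, not part of Simple HTML DOM or cURL.
 */
function probeOutboundPort(string $host, int $port, float $timeout = 5.0): string
{
    $errno = 0;
    $errstr = '';
    // @ suppresses the warning fsockopen() raises on failure;
    // the details are read from $errno and $errstr instead.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket !== false) {
        fclose($socket);
        return 'open';
    }
    return classifyConnectError($errno) . " ($errno: $errstr)";
}
```

For example, echoing `probeOutboundPort('www.google.com', 80)` and `probeOutboundPort('www.google.com', 443)` from a script in the hosting account should show "timed out" while the block is in place, even though the same host answers from your own machine. A "refused" result would instead point to the remote side actively rejecting the connection.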