
Azure won't cURL a google site search xml


I'm running the XML to PHP install of paid Google site search.

https://code.google.com/p/google-csbe-example/downloads/detail?name=gss.php&can=2&q=

My initial implementation runs perfectly on a LAMP server; however, I now have to run it in a PHP environment on the Windows Azure platform.

It appears that the request for $url is not getting through cURL, as the $result variable comes back NULL.

$url = 'https://www.google.com/cse?cx=' . $your_cx_number . '&client=google-csbe&output=xml_no_dtd&q=' . $q;
if(isset($start)){
    $url .= '&start=' . $start;
}
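
As a side note, the same URL can be assembled with http_build_query, which encodes each parameter for you. This is only a minimal sketch: the function name build_search_url is hypothetical and not part of gss.php, and it expects the raw query rather than one already passed through urlencode.

```php
<?php
// Hypothetical helper (not part of gss.php): builds the search URL
// from the cx number, the raw query, and an optional start offset.
function build_search_url($cx, $q, $start = null)
{
    $params = array(
        'cx'     => $cx,
        'client' => 'google-csbe',
        'output' => 'xml_no_dtd',
        'q'      => $q, // raw query; http_build_query does the encoding
    );
    if ($start !== null) {
        $params['start'] = (int) $start;
    }
    // Pass '&' explicitly so the arg_separator.output ini setting cannot change it.
    return 'https://www.google.com/cse?' . http_build_query($params, '', '&');
}

echo build_search_url('my_cx', 'azure curl', 10);
```

Leaving out the third argument simply omits the start parameter, which matches the isset($start) check above.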

If I change $url to point at a different remote XML file and adjust the output structure slightly, I get the expected results.

I have tried several different troubleshooting steps including:

  • cURL: alternate xml feed renders
  • simplexml: alternate rss feed renders
  • permissions: no site permissions are required in the Google CSE dashboard
  • alternate azure site: tested and failed
  • alternate LAMP hosted site: tested and success
  • alternate search setup: this had no effect
  • is the domain blocked by Google: I don't think so
  • url queries blocked: not sure if this is causing any issues

I'm stumped.

Any help would be greatly appreciated. Thanks!

Here's the full code (minus the cx number):

<?php 
//ini_set('display_startup_errors',1);
//ini_set('display_errors',1);
//error_reporting(-1);

$q = $_GET['q'];
$q = urlencode($q);

//WRAPPED IN AN IF STATEMENT SO PROMOTED RESULTS SHOW UP ON THE FIRST PAGE
if(isset($_GET['start'])){
   $start = $_GET['start'];
}

$your_cx_number = 'enter_your_paid_google_site_search_cx_number';

$url = 'https://www.google.com/cse?cx=' . $your_cx_number . '&client=google-csbe&output=xml_no_dtd&q=' . $q;
if(isset($start)){
    $url .= '&start=' . $start;
}

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);// allow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return into a variable
curl_setopt($ch, CURLOPT_TIMEOUT, 30); // times out after 30s
curl_setopt($ch, CURLOPT_HTTPGET, true); // use the GET method
//curl_setopt($ch, CURLOPT_POSTFIELDS, "postparam1=postvalue"); // add POST fields

//submit the xml request and get the response
$result = curl_exec($ch);
curl_close($ch);

//now parse the xml with SimpleXML
$xml = simplexml_load_string($result);
$START = $xml->RES['SN'];
$END = $xml->RES['EN'];
$RESULTS = $xml->RES->M;

?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Search results</title>
</head>
<body>
<form action="search-test.php" id="searchform" >
    <input type="text" name="q" placeholder="Search..." <?php if(isset($_GET['q'])) { echo 'value="' . htmlspecialchars($_GET['q'], ENT_QUOTES, 'UTF-8') . '"' ; }?> id="search-text" size="25" autocomplete="off" />
    <input type="submit"  id="search-button" title="Search" value="" />
</form>
<p>The url of the <a href="<?php echo $url ?>">XML output</a></p>
    <?php
//extract the title, link and snippet for each result
if ($xml->RES->R) {
    foreach ($xml->RES->R as $item) {
        $title = $item->T;
        $link = $item->U;
        $snippet = $item->S;
        echo    '<h3><a href="' . $link . '">' . $title . '</a></h3>
                 <p><a href="' . $link . '">' . $title . '</a></p>
                 <p>' . $snippet . '</p>
                 <hr />';
    }
}
?>
</body>
</html>
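
One thing the code above does not do is check $result before handing it to simplexml_load_string, so a failed request only shows up later as missing results. A minimal sketch of a guarded parse follows. The function name parse_results is hypothetical, and the sample XML is made up purely to mirror the element names (RES, R, T, U, S) used in the script; it is not real Google output.

```php
<?php
// Hypothetical helper: turns the XML string into a plain array and
// returns an empty array when the fetch failed or the XML is invalid.
function parse_results($xmlString)
{
    // curl_exec() returns false on failure, so reject anything that is not a non-empty string.
    if (!is_string($xmlString) || $xmlString === '') {
        return array();
    }
    libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    $xml = simplexml_load_string($xmlString);
    if ($xml === false || !isset($xml->RES->R)) {
        return array();
    }
    $items = array();
    foreach ($xml->RES->R as $item) {
        $items[] = array(
            'title'   => (string) $item->T,
            'link'    => (string) $item->U,
            'snippet' => (string) $item->S,
        );
    }
    return $items;
}

// Made-up sample that only mirrors the element names used in the script.
$sample = '<GSP><RES SN="1" EN="1"><M>1</M><R><T>Example</T>'
        . '<U>http://example.com/</U><S>Snippet</S></R></RES></GSP>';
print_r(parse_results($sample));
print_r(parse_results(false)); // a failed curl_exec() yields an empty array
```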

Solution

  • Turning on cURL error reporting returned error code 60:

    Curl error: 60 - SSL certificate problem: unable to get local issuer certificate

    A search for a similar error led to the solution in the question "HTTPS and SSL3_GET_SERVER_CERTIFICATE:certificate verify failed, CA is OK".

    Add the line:

    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
    

    The full cURL function is now:

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL,$url);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);// allow redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return into a variable
    curl_setopt($ch, CURLOPT_TIMEOUT, 30); // times out after 30s
    curl_setopt($ch, CURLOPT_HTTPGET, true); // use the GET method
    
    $result = curl_exec($ch);
    if (curl_errno($ch)) {
        echo 'Curl error: ' . curl_errno($ch) . ' - ' . curl_error($ch);
        $info = curl_getinfo($ch);
        if (empty($info['http_code'])) {
            die("No HTTP code was returned");
        } else {
            echo $info['http_code'];
        }
    }
    curl_close($ch);
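
Note that setting CURLOPT_SSL_VERIFYPEER to FALSE turns certificate validation off entirely. An alternative, sketched here on the assumption that you have uploaded a CA bundle file (for example the cacert.pem published by the curl project) next to the script, is to leave verification on and point cURL at that bundle with CURLOPT_CAINFO. The function names and the bundle path are hypothetical, and I have not tested this on Azure.

```php
<?php
// Hypothetical helper: cURL options that keep certificate verification
// enabled and use an explicit CA bundle.
// $caBundle is an assumed path to a PEM file you deployed yourself.
function secure_curl_options($url, $caBundle)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_SSL_VERIFYPEER => true,      // keep peer verification on
        CURLOPT_SSL_VERIFYHOST => 2,         // also check the host name
        CURLOPT_CAINFO         => $caBundle, // where to find trusted CA certificates
        CURLOPT_FAILONERROR    => true,
        CURLOPT_FOLLOWLOCATION => true,      // allow redirects
        CURLOPT_RETURNTRANSFER => true,      // return into a variable
        CURLOPT_TIMEOUT        => 30,        // times out after 30s
    );
}

// Hypothetical helper: fetches the URL and throws instead of returning false.
function fetch_xml($url, $caBundle)
{
    $ch = curl_init();
    curl_setopt_array($ch, secure_curl_options($url, $caBundle));
    $result = curl_exec($ch);
    if ($result === false) {
        throw new RuntimeException(
            'Curl error: ' . curl_errno($ch) . ' - ' . curl_error($ch)
        );
    }
    return $result; // the handle is released when $ch goes out of scope
}
```

Usage would look something like $result = fetch_xml($url, __DIR__ . '/cacert.pem'); wrapped in a try/catch, with the path adjusted to wherever the bundle actually lives.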