Tags: php, tor, darknet, onion-architecture

Can't get website meta description over the Tor network using PHP


Dear friends, I'm trying to build a project where you enter a Tor .onion address and it finds the website's title and meta description. Here is my code:

<?php 

$ch = curl_init('http://torchdeedp3i2jigzjdmfpn5ttjhthh5wbmda2rr3jvqjg5p77c54dqd.onion');
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => 1,
    CURLOPT_PROXYTYPE      => CURLPROXY_SOCKS5_HOSTNAME,
    CURLOPT_PROXY          => '127.0.0.1:9150',
    CURLOPT_HEADER         => 0,
    CURLOPT_FOLLOWLOCATION => 1,
    CURLOPT_ENCODING       => '',
    CURLOPT_COOKIEFILE     => '',
]);

$response = curl_exec($ch);

if ($response === false) {
    echo sprintf(
        "Request failed.  Error (%d) - %s\n",
        curl_errno($ch),
        curl_error($ch)
    );
    exit;
}

if (preg_match('/\<title\>(.*)\<\/title\>/i', $response, $match)) {
    echo "The title is '{$match[1]}'";
} else {
    echo "Did not find title in page.";
}

echo "<br></br><br>";

$tags = get_meta_tags($response);
echo $tags['description'];  // a php manual

 ?>

I get the title of the website correctly, but the problem arises when I try to get the meta description of the .onion website. Here is a screenshot (image not included).

Please help me. What is wrong with my PHP code?


Solution

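  • get_meta_tags() expects a filename or URL as its argument, not a string of HTML, so passing $response to it does not work. You can use DOMDocument to parse the response instead (I've tested this on my server and it works):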

    Please replace:

    $tags = get_meta_tags($response);
    echo $tags['description'];  // a php manual
    

    with

    
    
    // Parsing begins here:
    $doc = new DOMDocument();
    @$doc->loadHTML($response); // "@" suppresses warnings from malformed HTML

    // Default to empty strings so the echo lines below never read undefined variables
    $description = '';
    $keywords = '';

    $metas = $doc->getElementsByTagName('meta');

    for ($i = 0; $i < $metas->length; $i++)
    {
        $meta = $metas->item($i);
        if (strtolower($meta->getAttribute('name')) == 'description')
            $description = $meta->getAttribute('content');
        if (strtolower($meta->getAttribute('name')) == 'keywords')
            $keywords = $meta->getAttribute('content');
    }

    echo "Description: $description". '<br/><br/>';
    echo "Keywords: $keywords";
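
If you want to reuse this parsing, it could be wrapped in a small function along the following lines. This is only a sketch: the function name extract_page_metadata and the sample HTML string are assumptions made up for illustration, not part of the code above. It also swaps the @ operator for libxml_use_internal_errors(), which keeps parse warnings from malformed HTML out of the output, and it returns null for anything the page does not provide.

```php
<?php
// Hypothetical helper (name chosen for illustration): returns the title and
// selected meta tags found in an HTML string, or null for anything missing.
function extract_page_metadata(string $html): array
{
    $result = ['title' => null, 'description' => null, 'keywords' => null];

    if (trim($html) === '') {
        return $result; // loadHTML() does not accept empty input
    }

    // Collect libxml parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $result['title'] = trim($titles->item(0)->textContent);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Meta names are compared case-insensitively, as in the loop above.
        $name = strtolower($meta->getAttribute('name'));
        if ($name === 'description' || $name === 'keywords') {
            $result[$name] = $meta->getAttribute('content');
        }
    }

    return $result;
}

// Usage with a made-up HTML string standing in for $response from curl_exec():
$sample = '<html><head><title> Example </title>'
        . '<meta name="Description" content="A sample page"></head><body></body></html>';
$info = extract_page_metadata($sample);
echo $info['title'] ?? '(no title)', "\n";
echo $info['description'] ?? '(no description)', "\n";
```

In the original script you would pass $response instead of $sample, after the curl_exec() error check.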