I have this simple code to get the title of any page:
<?php
$doc = new DOMDocument();
@$doc->loadHTMLFile('http://www.facebook.com');
$xpath = new DOMXPath($doc);
echo $xpath->query('//title')->item(0)->nodeValue."\n";
?>
It works fine on every page I have tried except Facebook.
When I try it on Facebook, it does not show "Welcome to Facebook - Log In, Sign Up or Learn More"; instead it shows "Update Your Browser | Facebook".
I think the problem is the user agent. Is there a way to change the user agent, or is there another solution?
You can override PHP's user_agent ini setting at runtime with ini_set(), without editing php.ini and without needing cURL. Just add the lines below before you load the DOMDocument:
$agent = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)";
ini_set('user_agent', $agent);
And then your code:
$doc = new DOMDocument();
@$doc->loadHTMLFile('http://www.facebook.com');
$xpath = new DOMXPath($doc);
echo $xpath->query('//title')->item(0)->nodeValue."\n";
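If you would rather not change the user agent for the whole script, a per-request stream context is another option. The following is only a minimal sketch: the helper names extractTitle() and fetchTitle() are hypothetical (not from the code above), and it assumes allow_url_fopen is enabled. It also returns null instead of failing when the page cannot be fetched or has no title.

```php
<?php
// Hypothetical helper: return the <title> text of an HTML string, or null.
function extractTitle(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of silencing them with @.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $node = $xpath->query('//title')->item(0);

    return $node === null ? null : trim($node->nodeValue);
}

// Hypothetical helper: fetch a URL with a custom User-Agent for this
// request only, then extract its title. Returns null if the fetch fails.
function fetchTitle(string $url, string $agent): ?string
{
    $context = stream_context_create([
        'http' => ['user_agent' => $agent],
    ]);
    $html = @file_get_contents($url, false, $context);

    return $html === false ? null : extractTitle($html);
}
```

Usage would look like echo fetchTitle('http://www.facebook.com', $agent); with $agent set as above. Which title comes back still depends on how the site treats that user agent string.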