I'm trying to get a specific webpage using php file_get_contents - when I view the page directly there is no problem but when trying to grab it using php I get "failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden". Theres a piece of data that I'm trying to extract from the page.
$ft = file_get_contents('https://www.vesselfinder.com/vessels/CELEBRITY-MILLENNIUM-IMO-9189419-MMSI-249055000');
echo $ft;
I've read up on various pages here about using stream_context_create, mainly the user agent part
$context = stream_context_create(
array(
"http" => array(
"header" => "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36"
)
)
);
But nothing works and I now get a 400 error message. Unfortunately it doesn't look like my server is configured to use cURL so file_get_contents seems to be the only way for me to do this.
You need to add the User-Agent
header to the actual header:
$context = stream_context_create(
array(
'http' => array(
'header' => 'User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36',
),
));
You could also use the user_agent
option:
$context = stream_context_create(
array(
'http' => array(
'user_agent' => 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36',
),
));
Both above examples should work and you should now be able to get the contents using:
$content = file_get_contents('https://www.vesselfinder.com/vessels/CELEBRITY-MILLENNIUM-IMO-9189419-MMSI-249055000', false, $context);
echo $content;
This could of course also be tested using curl from the command line. Notice that we are setting our own User-Agent
header:
curl --verbose -H 'User-Agent: YourApplication/1.0' 'https://www.vesselfinder.com/vessels/CELEBRITY-MILLENNIUM-IMO-9189419-MMSI-249055000'
It might also be worth knowing that the default User-Agent
used by curl seems to be blocked, so if using curl you need to add your own using the -H
flag.