I'm having a small problem that looks very simple, but I just don't get it. I'm trying to download the content of http://cspsp.gshi.org/ (if you access it via www.cspsp.gshi.org you end up on the wrong page).
This is how I do it in PowerShell:
(New-Object System.Net.WebClient).DownloadFile( 'http://cspsp.gshi.org/', 'save.htm' )
I can access the website with Firefox and download its contents easily, but PowerShell always outputs something like this:
The remote server returned an error: (404) Not Found.
(translated from German)
I'm not sure what I'm doing wrong here. Other websites like Google just work fine.
It appears that the site relies on the User-Agent request header being sent by HTTP clients, and that System.Net.WebClient doesn't send one by default (at least, it didn't when I hit my own local servers).
Either way, this worked for me:
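# Set a browser-like User-Agent explicitly before making the request.
$client = New-Object System.Net.WebClient
$client.Headers['User-Agent'] = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.40 Safari/537.17"
# A relative file name is resolved against the process working directory, which can differ from the PowerShell location.
$client.DownloadFile('http://cspsp.gshi.org/', 'saved.html')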
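
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the function names New-BrowserLikeWebClient and Save-WebPage are made up for illustration, and the default User-Agent string is simply the one used above. It also resolves the output path against the current PowerShell location and disposes the client afterwards.

```powershell
# Hypothetical helper: creates a WebClient that sends a browser-like User-Agent.
function New-BrowserLikeWebClient {
    param(
        [string]$UserAgent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.40 Safari/537.17'
    )
    $client = New-Object System.Net.WebClient
    $client.Headers['User-Agent'] = $UserAgent
    return $client
}

# Hypothetical helper: downloads $Url to $Path using the client above.
function Save-WebPage {
    param(
        [Parameter(Mandatory = $true)][string]$Url,
        [Parameter(Mandatory = $true)][string]$Path
    )
    # Resolve relative paths against the PowerShell location rather than
    # the process working directory, which DownloadFile would otherwise use.
    $fullPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Path)
    $client = New-BrowserLikeWebClient
    try {
        $client.DownloadFile($Url, $fullPath)
    }
    finally {
        # Release the underlying resources even if the download fails.
        $client.Dispose()
    }
}
```

Usage would then look like: Save-WebPage -Url 'http://cspsp.gshi.org/' -Path 'saved.html'. On PowerShell 3.0 and later, Invoke-WebRequest with its -UserAgent and -OutFile parameters should get you the same result without WebClient.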