I have a huge list of blog URLs that I need to check the validity of. I've knocked together a script from this answer and from here.
Here is my script:
$siteURL = 'http://example.com/'
$File = '.\urls.txt'
$NewContent = Get-Content -Path $File | ForEach-Object {
    $_
    $HTTP_Request = [System.Net.WebRequest]::Create($siteURL + $_)
    $HTTP_Response = $HTTP_Request.GetResponse()
    $HTTP_Status = [int]$HTTP_Response.StatusCode
    if ($HTTP_Status -eq 200) {
        " - 200"
    } else {
        " - " + $HTTP_Status
    }
    $HTTP_Response.Close()
}
$NewContent | Out-File -FilePath $File -Encoding Default -Force
My issue is that when it gets to a 404 error it doesn't add this to the file and returns the following error in the console:
Exception calling "GetResponse" with "0" argument(s): "The remote server
returned an error: (404) Not Found."
At C:\Users\user.name\urlcheck.ps1:19 char:9
+ $HTTP_Response = $HTTP_Request.GetResponse()
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : WebException
Why am I getting this error?
Bonus question: my "200 - OK" responses are getting added to a new line, why?
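GetResponse() throws a WebException whenever the server returns an error status such as 404, so the rest of the loop body never runs for that URL and nothing is written for it. In order to handle a 404 response (and similar error responses), we need a bit of error handling code: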
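ForEach-Object {
    $_
    $HTTP_Request = [System.Net.WebRequest]::Create($siteURL + $_)
    # Reset so a failed request never reuses the previous iteration's response
    $HTTP_Response = $null
    try {
        $HTTP_Response = $HTTP_Request.GetResponse()
    }
    catch [System.Net.WebException] {
        # HTTP error, grab response from exception (null if the server never answered)
        $HTTP_Response = $_.Exception.Response
    }
    catch {
        # Something else went horribly wrong, maybe abort?
    }
    # With no response at all, this evaluates to 0
    $HTTP_Status = [int]$HTTP_Response.StatusCode
    If ($HTTP_Status -eq 200) {
        " - 200"
    }
    Else {
        " - " + $HTTP_Status
    }
    # Only close a response we actually received
    if ($null -ne $HTTP_Response) {
        $HTTP_Response.Close()
    }
}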
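The console reports a MethodInvocationException, yet the typed catch above still fires, because PowerShell also matches the catch type against the wrapped .NET exception. Here is a small offline sketch of that behaviour, using [int]::Parse and FormatException purely as stand-ins for GetResponse() and WebException (the variable name $handledBy is made up for the illustration):

```powershell
# [int]::Parse throws a .NET FormatException, which PowerShell wraps in a
# MethodInvocationException, the same way it wraps GetResponse()'s WebException.
$handledBy = 'no error'
try {
    [void][int]::Parse('not a number')
}
catch [System.FormatException] {
    # Typed catch: matched against the wrapped (inner) exception
    $handledBy = 'typed catch'
}
catch {
    # Fallback for anything the typed catch did not match
    $handledBy = 'generic catch'
}
"Handled by: $handledBy"
```

If the typed block were removed, the generic catch would handle the error instead, which is why the more specific block should come first.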
Bonus question: my "200 - OK" responses are getting added to a new line, why?
That's because you output $_ and " - " + ... in two separate statements. Remove the $_ from the top and combine it all in a single string:
ForEach-Object {
    $HTTP_Request = [System.Net.WebRequest]::Create($siteURL + $_)
    # Reset so a failed request never reuses the previous iteration's response
    $HTTP_Response = $null
    try {
        $HTTP_Response = $HTTP_Request.GetResponse()
    }
    catch [System.Net.WebException] {
        # HTTP error, grab response from exception (null if the server never answered)
        $HTTP_Response = $_.Exception.Response
    }
    catch {
        # Something else went horribly wrong, maybe abort?
    }
    finally {
        # Grab status code (0 if there was no response) and dispose of response stream
        $HTTP_Status = [int]$HTTP_Response.StatusCode
        if ($null -ne $HTTP_Response) {
            $HTTP_Response.Dispose()
        }
    }
    "$_ - $HTTP_Status"
}
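One case the loop above cannot report properly is a WebException that carries no response at all, for example when the host name does not resolve or the request times out. A possible way to report something useful there is sketched below. The helper name Get-WebExceptionStatus is hypothetical and not part of the original script, and the [type]::new() syntax assumes PowerShell 5 or later:

```powershell
# Hypothetical helper: turns a WebException into a short status string.
function Get-WebExceptionStatus {
    param(
        [Parameter(Mandatory)]
        [System.Net.WebException]$Exception
    )

    if ($null -ne $Exception.Response) {
        # The server answered: report the numeric HTTP status (404, 500, ...)
        [string][int]$Exception.Response.StatusCode
    }
    else {
        # No HTTP response (DNS failure, timeout, ...): report the WebExceptionStatus name
        [string]$Exception.Status
    }
}

# Offline usage example with a simulated DNS failure
$dnsFailure = [System.Net.WebException]::new(
    'simulated failure',
    [System.Net.WebExceptionStatus]::NameResolutionFailure
)
Get-WebExceptionStatus -Exception $dnsFailure   # NameResolutionFailure
```

Inside the typed catch block, this could be called as Get-WebExceptionStatus -Exception $_.Exception, so that unreachable hosts show up in the output file with a reason instead of a bare 0.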