
Get a file from a URL with PowerShell


I am receiving an email that contains a link to a website that immediately starts a download of a file. I am able to successfully get the email and the URL, and when I paste the URL into the browser it automatically starts a download. The web page is below:

[Screenshot: the web page for the URL]

Unfortunately, the file can only be sent in an .xls format, but my end goal is to convert it to a CSV.

I know that Invoke-WebRequest is supposed to do this, and my command for that is:

Invoke-WebRequest -Uri $ExcelLink -OutFile 'C:\Temp\FileName.xls'

I have also tried the following:

(New-Object System.Net.WebClient).DownloadFile($ExcelLink,'C:\Temp\FileName.xls')

I have tried saving the output as both .xls and .csv, but in both cases I only get the page's raw HTML instead of the file itself. In the screenshots below, the left is the .csv export and the right is the .xls export:

[Screenshots: output from the web requests]
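
One quick way to confirm what was actually saved is to look at the first bytes of the file. This is a minimal sketch and not part of the original attempts. The function name Test-IsLegacyExcelFile is hypothetical. It relies on the fact that legacy .xls files are OLE2 compound documents that begin with the bytes D0 CF 11 E0 A1 B1 1A E1, whereas a saved HTML page begins with text such as <!DOCTYPE or <html.

```powershell
# Hypothetical helper: returns $true when the file starts with the OLE2
# signature used by legacy .xls workbooks, $false otherwise.
function Test-IsLegacyExcelFile {
    param(
        [Parameter(Mandatory)]
        [string]$Path   # pass a full path; .NET resolves relative paths against the process directory
    )

    # Signature that opens every OLE2 compound file, including legacy .xls.
    [byte[]]$Signature = 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1

    # Reads the whole file for simplicity; fine for small downloads.
    $Bytes = [System.IO.File]::ReadAllBytes($Path)
    if ($Bytes.Length -lt $Signature.Length) { return $false }

    for ($i = 0; $i -lt $Signature.Length; $i++) {
        if ($Bytes[$i] -ne $Signature[$i]) { return $false }
    }
    return $true
}

# Demo with two throwaway files: one HTML page, one file carrying the .xls signature.
$HtmlPath = [System.IO.Path]::GetTempFileName()
$XlsPath  = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllText($HtmlPath, '<html><body>Download</body></html>')
[System.IO.File]::WriteAllBytes($XlsPath, [byte[]](0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1, 0x00))

$HtmlIsExcel = Test-IsLegacyExcelFile -Path $HtmlPath
$XlsIsExcel  = Test-IsLegacyExcelFile -Path $XlsPath

Remove-Item $HtmlPath, $XlsPath
```

Running the helper against 'C:\Temp\FileName.xls' after a download attempt would show whether the request returned a workbook or just the landing page.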

I have done a decent amount of research already, and the most helpful link was this Stack Overflow post.

The link from the email does not contain the file name. I stripped out a good amount from the URL, but it looks something like this:

https://_______.com/f/a/vl6K...hRdg~~/AA...gA~/RgRnjCy...QAAAM-

I have tried adding the file name to the end of the URL and for some reason it just redirects to Google.

Does anybody know of a way to get only the file content that automatically starts downloading when entering the URL in a browser?


Solution

  • For anybody finding this in the future, I found a solution. First, run Invoke-WebRequest on the URL from the email. Then inspect the RawContent property of the response. This may differ for each request, but in my case the page contained a section of JavaScript that defined a variable named downloadUrl. Passing that URL to a second Invoke-WebRequest call successfully downloaded the file.

    Here is some sample code, which works for my specific website request. Hopefully this will help somebody troubleshoot in the future.

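    # $ExcelLink is the URL from the email; it opens the web page that triggers the automatic download
    $Request = Invoke-WebRequest -Uri $ExcelLink
    # The parentheses capture the URL assigned to the page's downloadUrl variable as regex group 1
    $DownloadUrlRegex = "var downloadUrl = '(\S+)';"
    # Only read $Matches when -match succeeds; after a failed match it still holds the previous result
    if ($Request.RawContent -match $DownloadUrlRegex) {
        $DownloadUrl = $Matches[1]
        # $Destination is the local path to save to, e.g. 'C:\Temp\FileName.xls'
        Invoke-WebRequest -Uri $DownloadUrl -OutFile $Destination
    } else {
        Write-Error 'downloadUrl was not found in the response.'
    }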
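
The extraction step can be tried offline before pointing it at a live link. The following is a minimal sketch, assuming the page embeds the link as var downloadUrl = '...'; as it did in the request above. The function name Get-DownloadUrlFromHtml and the sample HTML are hypothetical and used only for illustration.

```powershell
# Hypothetical helper: pull the downloadUrl value out of a page's HTML.
# Assumes the page contains a line like: var downloadUrl = 'https://...';
function Get-DownloadUrlFromHtml {
    param(
        [Parameter(Mandatory)]
        [string]$Html
    )

    # [regex]::Match avoids relying on the automatic $Matches variable,
    # which keeps its old value when a later -match fails.
    $Match = [regex]::Match($Html, "var downloadUrl = '(\S+)';")
    if ($Match.Success) {
        return $Match.Groups[1].Value
    }
    return $null
}

# Made-up sample HTML standing in for the real response body.
$SampleHtml = @"
<html><script>
var downloadUrl = 'https://example.com/files/report.xls';
window.location = downloadUrl;
</script></html>
"@

$Url = Get-DownloadUrlFromHtml -Html $SampleHtml
```

In the real flow, the RawContent (or Content) of the first response would be passed as -Html, and a $null result would signal that the page layout differs from the one assumed here.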