
PowerShell - Issues Downloading Base64-Encoded PDF (Blank/White PDF)


Overview:

I am trying to make a script that downloads my internet service provider's (ISP) monthly statement. 'Cus, you know, who doesn't do that regularly? My ISP is Spectrum. I have the code that uses Invoke-RestMethod, and it gets the proper response. The response gives me a JSON object with base64-encoded pdf text.

Code:

$session = New-Object Microsoft.PowerShell.Commands.WebRequestSession
$session.Cookies.Add((New-Object System.Net.Cookie("thumbprint_consumer_portal", "xxxx", "/", ".spectrum.net")))
$pdf = Invoke-RestMethod -UseBasicParsing -Uri "https://apis.spectrum.net/selfservice/graph" `
-Method "POST" `
-WebSession $session `
-Headers @{
  "authorization"="Bearer xxxx"
} `
-ContentType "application/json" `
-Body "{$body}"

After pulling the PDF data out of the JSON object, I found that it is Base64-encoded. "No problem!" I thought. So I decode it and convert the resulting bytes into an ASCII string.

$raw = $pdf.data.viewer.account.statementPdf
$file = [System.Text.Encoding]::ASCII.GetString([System.Convert]::FromBase64String($raw))
Out-File -FilePath "bill.pdf" -InputObject $file

Problem:
The problem shows up when I decode the response and write it out to a PDF file. The PDF is blank! But it has the right number of pages, and the file size matches that of an original file I downloaded from the website. I have tried both Acrobat and the Chrome PDF viewer to open the PDFs.

Troubleshooting

  • I copied the original Base64-encoded text of the PDF (the raw text from the response) into CyberChef and decoded it with Base64. There's an option to "Save output to file", which I used, and the PDF comes out as expected! It shows all my Internet details for the month, so that works! But I shouldn't have to open CyberChef every time I want to do this. This tells me that there's something wrong with PowerShell's decoding/encoding process.

  • I tried different encodings from "[System.Text.Encoding]::": (UTF8, ASCII, Unicode, Default, Latin1). ASCII produces the file size it should be and the correct number of pages, but all the pages are still white and blank. UTF8 and "Default" almost double the file size; the page count is correct, but the pages are still white and blank. "Latin1" behaves almost the same as UTF8 and Default, but produces a slightly smaller file.

    • Using the other encodings corrupts the file, and PDF readers are unable to open it.
  • I looked around online and found some suggestions:

    • "Use the '-OutFile' parameter in Invoke-RestMethod". I tried that, but the PDF file becomes unreadable and the file size becomes bigger than the original.
    • I tried using a different "-ContentType" in the request. I tried "text/plain", but that results in the website thinking it's a CSRF attack and blocking it. Same with "x-www-form-urlencoded". Using "application/pdf" gives a different error.

Expectations

I expect the PDF file to just open properly and show me the billing details, as if I had downloaded it from the website.

Any suggestions?


Solution

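  • A PDF document is a binary file, not a text file, so turning it into a string with [System.Text.Encoding]::ASCII.GetString(...) won't do you any good. ASCII only covers byte values 0 through 127, and every byte above that range is replaced with "?" during decoding. That would explain why the file size and page count look right while the page contents come out blank.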

    Instead, do exactly what you did with CyberChef - decode the base64 string and then output the resulting byte stream directly to disk:

    $fileContents = [System.Convert]::FromBase64String($raw)
    
    # for Windows PowerShell:
    Set-Content -Path path\to\file.pdf -Value $fileContents -Encoding Byte
    # for PowerShell 6 and later (where -AsByteStream replaced -Encoding Byte):
    Set-Content -Path path\to\file.pdf -Value $fileContents -AsByteStream
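
As a rough illustration of the difference between the two approaches, here is a minimal sketch that uses a tiny made-up byte sequence in place of the real statement: the "%PDF-" signature followed by four bytes above 0x7F. The sample bytes, the variable names, and the temp-file name are all assumptions for the demo. It also writes the file with [System.IO.File]::WriteAllBytes, which is an alternative to Set-Content that should behave the same in Windows PowerShell and PowerShell 6+.

```powershell
# Stand-in for the decoded PDF (hypothetical sample data): "%PDF-" followed by
# a few bytes above 0x7F, like the ones found in a compressed stream.
$bytes  = [byte[]](0x25, 0x50, 0x44, 0x46, 0x2D, 0xE2, 0xE3, 0xCF, 0xD3)
$base64 = [System.Convert]::ToBase64String($bytes)   # plays the role of $raw

# Decode the Base64 text back to raw bytes.
$decoded = [System.Convert]::FromBase64String($base64)

# Text round trip: ASCII has no characters above 0x7F, so those bytes turn into '?' (0x3F).
# The length stays the same, but the binary content is gone.
$asText    = [System.Text.Encoding]::ASCII.GetString($decoded)
$roundTrip = [System.Text.Encoding]::ASCII.GetBytes($asText)
$lossy     = [System.BitConverter]::ToString($roundTrip)

# Byte-for-byte write: no text encoding is involved at any point.
# The output file name is an assumption for this demo.
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'base64-demo.pdf'
[System.IO.File]::WriteAllBytes($outPath, $decoded)
$written = [System.BitConverter]::ToString([System.IO.File]::ReadAllBytes($outPath))

"Text round trip: $lossy"
"Written bytes:   $written"
```

One thing to watch for with the .NET file methods: they resolve relative paths against the process working directory, which is not necessarily PowerShell's current location, so passing a full path (as above) is the safer habit.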