How do I use a script to access text on a webpage, which is behind authentication?


I have a website where I can view information after logging in. I need to capture some of the displayed text for use in a script.

Installing software is not an option: I have to do this with the tools that come with Windows 10.

I tried Chrome's print-to-PDF feature, but it doesn't work with authentication. The printed page was just the login page, even though I had logged in and navigated to the information I need.

Apparently, PowerShell can use something called wscript to send keystrokes: bring the window to the front, select everything, copy it, and dump it into a text file. I have no idea where to start with that, though.
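As a starting point, the keystroke approach might look like the sketch below. It assumes the page is already open in a browser window whose title contains `My Data Page`; that title and the output path are hypothetical placeholders, and the sleep durations are guesses. `WScript.Shell` and `Get-Clipboard` both ship with Windows 10, but SendKeys is fragile if another window takes focus midway.

```powershell
# Sketch only: the window title and output file are placeholders, not values from the question.
$windowTitle = 'My Data Page'          # hypothetical: part of the browser window's title
$outFile     = "$env:TEMP\page.txt"    # hypothetical output location

$shell = New-Object -ComObject WScript.Shell

# AppActivate returns $true when a window with a matching title was brought to the front.
if ($shell.AppActivate($windowTitle)) {
    Start-Sleep -Milliseconds 500      # give the window time to take focus
    $shell.SendKeys('^a')              # Ctrl+A: select everything on the page
    Start-Sleep -Milliseconds 200
    $shell.SendKeys('^c')              # Ctrl+C: copy the selection
    Start-Sleep -Milliseconds 500      # let the clipboard fill before reading it

    # Write the copied text to a file.
    Get-Clipboard | Set-Content -Path $outFile
} else {
    Write-Warning "No window matching '$windowTitle' was found."
}
```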

I tried to use Postman to build a request that would let me access that page. However, even with the correct credentials, it reports:

anti forgery validation failed

While using Postman, I noticed that a cookie is set when the login page is opened (before I log in). The developer tools in Firefox confirm that the login page provides this cookie, called __H2RequestVerification. When making the login request, the browser POSTs the username, the password, and this cookie (a long random string of letters and numbers).

I tried to do this manually in Postman, but when I reach the step where the credentials are supplied, I always get a "connection reset" error, even when supplying the token in the cookie.
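The flow described above (load the login page, receive the cookie, then POST the credentials together with the token) can be sketched with `Invoke-WebRequest`, which ships with Windows 10. This is a minimal sketch, not a confirmed fix: the login URL and the field names `__RequestVerificationToken`, `EmailOrUsername` and `Password` are taken from the question, while `$dataUrl`, the credentials, the output path and the TLS 1.2 setting are assumptions. The cookie and the hidden form token are normally issued together on the same page load, so both are fetched fresh here rather than pasted in. Note also that the curl command in the question uses `--form`, which sends multipart/form-data while its headers declare `application/x-www-form-urlencoded` with a fixed `Content-Length`; that mismatch may be one reason the manual attempt fails.

```powershell
# Sketch only: URL and field names come from the question; credentials are placeholders.
$loginUrl = 'https://data-demo.xxx.ac.uk/account/login?ReturnUrl=%2F'
$dataUrl  = 'https://data-demo.xxx.ac.uk/'   # assumption: the page holding the text you need

# Assumption: the server requires TLS 1.2, which Windows PowerShell 5.1 may not offer by default.
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

# 1. GET the login page. -SessionVariable stores the __H2RequestVerification cookie in $session.
$loginPage = Invoke-WebRequest -Uri $loginUrl -SessionVariable session -UseBasicParsing

# 2. Read the hidden anti-forgery field from the returned HTML.
$token = ($loginPage.InputFields |
    Where-Object { $_.name -eq '__RequestVerificationToken' } |
    Select-Object -First 1).value
if (-not $token) { throw 'No __RequestVerificationToken field found on the login page.' }

# 3. POST the credentials with the fresh token. A hashtable body is sent as
#    application/x-www-form-urlencoded, and -WebSession sends the cookie back.
$form = @{
    '__RequestVerificationToken' = $token
    'EmailOrUsername'            = 'your-username'   # placeholder
    'Password'                   = 'your-password'   # placeholder
}
$null = Invoke-WebRequest -Uri $loginUrl -Method Post -Body $form -WebSession $session -UseBasicParsing

# 4. Reuse the authenticated session to fetch the protected page and save its HTML.
$page = Invoke-WebRequest -Uri $dataUrl -WebSession $session -UseBasicParsing
$page.Content | Set-Content -Path "$env:TEMP\page.html"
```

`-UseBasicParsing` is used so the sketch does not depend on the Internet Explorer engine; the saved HTML can then be searched with `Select-String` or a regular expression for the value the script needs.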

Raw request from Postman, in curl format (this does not work):

curl --location 'https://data-demo.xxx.ac.uk/account/login?ReturnUrl=%2F' \
--header 'Host:  data-demo.xxx.ac.uk' \
--header 'User-Agent:  Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/111.0' \
--header 'Accept:  text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8' \
--header 'Accept-Language:  en-GB,en;q=0.5' \
--header 'Accept-Encoding:  gzip, deflate, br' \
--header 'Content-Type:  application/x-www-form-urlencoded' \
--header 'Content-Length:  182' \
--header 'Origin:  https://data-demo.xxx.ac.uk' \
--header 'DNT:  1' \
--header 'Connection:  keep-alive' \
--header 'Referer:  https://data-demo.xxx.ac.uk/account/login?ReturnUrl=%2F' \
--header 'Cookie:  __H2RequestVerification=Wj3e8tH-8ikvaghOBS0k5x0Vd9X74CRhVRw5Ch9BgNwLIkfGYNI0Do9stFyI0B0yVoq6BQIeJZTGqApRs8Tb3tx0sMg1' \
--header 'Upgrade-Insecure-Requests:  1' \
--header 'Sec-Fetch-Dest:  document' \
--header 'Sec-Fetch-Mode:  navigate' \
--header 'Sec-Fetch-Site:  same-origin' \
--header 'Sec-Fetch-User:  ?1' \
--header 'Sec-GPC:  1' \
--header 'TE:  trailers' \
--form '__RequestVerificationToken="JtyADE1k-gov_-IYAGMh4urwLI0GK32wlltEZUPetV2TPSMpLE1vY7L8qBkn-Z9sWfcQl9vZfWukq04C55Oj9cFBRkU1"' \
--form 'EmailOrUsername="abc@123"' \
--form '.xxx="aPassWord"'

I don't know how to copy just the raw HTTP request from Firefox, though I presume there must be a way. To be clear, the Firefox request shown below is the one that works.

Here are the headers:

Host: data-demo.xxx.ac.uk
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/111.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8
Accept-Language: en-GB,en;q=0.5
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded
Content-Length: 182
Origin: https://data-demo.xxx.ac.uk
DNT: 1
Connection: keep-alive
Referer: https://data-demo.xxx.ac.uk/account/login
Cookie: __H2RequestVerification=Wj3e8tH-8ikvaghOBS0k5x0Vd9X74CRhVRw5Ch9BgNwLIkfGYNI0Do9stFyI0B0yVoq6BQIeJZTGqApRs8Tb3tx0sMg1
Upgrade-Insecure-Requests: 1
Sec-Fetch-Dest: document
Sec-Fetch-Mode: navigate
Sec-Fetch-Site: same-origin
Sec-Fetch-User: ?1
Sec-GPC: 1
TE: trailers

Here is the form data:

__RequestVerificationToken  "u9tHCizsNnw0iZ4olHk5gt7gAqMCDEDrcQvZWM08TdT-U10NRfuEU2B8leZ4TU5Eq8UzE8YsfEemwvr8xCcHnVFJKnU1"
EmailOrUsername "123@abc"
Password    "aPassWord"

And the cookie:

__H2RequestVerification "Wj3e8tH-8ikvaghOBS0k5x0Vd9X74CRhVRw5Ch9BgNwLIkfGYNI0Do9stFyI0B0yVoq6BQIeJZTGqApRs8Tb3tx0sMg1"

Solution

  • You can indeed use Selenium. Here's an idea:

    # Assumes the Selenium PowerShell module (which provides Enter-SeUrl and Set-SeCookie) is available
    Import-Module Selenium

    $ChromeOptions = New-Object OpenQA.Selenium.Chrome.ChromeOptions
    $ChromeOptions.AddArgument('--log-level=3')      # Quiet mode
    $ChromeOptions.AddArgument('--kiosk-printing')   # Automatically press the print button in print preview
    $myMap = @{}
    $myMap.Add("default_directory", $downloadpath)   # Set your default download path
    $ChromeOptions.AddUserProfilePreference("download", $myMap)
    $driver = New-Object OpenQA.Selenium.Chrome.ChromeDriver($ChromeOptions)

    Enter-SeUrl -Url $your_url -Driver $driver
    # $cookies is expected to hold (name, value) pairs, e.g. copied from a logged-in browser session
    foreach ($cookie in $cookies) { Set-SeCookie -Name $cookie[0] -Value $cookie[1] -Target $driver }
    Start-Sleep 10   # Give the page time to load
    # Rename the page (used as the printed file name), then print it
    $driver.ExecuteScript("document.title='$nameyouwant'; window.print();")
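
Since the goal in the question is to capture text rather than a printout, the same driver could also read the page text directly. This is a minimal sketch that assumes `$driver` is a ChromeDriver created as in the snippet above and is already showing the target page; the output path is a hypothetical placeholder.

```powershell
# Sketch only: assumes $driver is a ChromeDriver instance already on the target page.
$outFile = "$env:TEMP\page.txt"   # hypothetical output location

# Grab the rendered text of the whole page body and save it.
$body = $driver.FindElement([OpenQA.Selenium.By]::TagName('body'))
$body.Text | Set-Content -Path $outFile

# Close the browser and end the driver session when finished.
$driver.Quit()
```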