vba xmlhttprequest

Unable to get the content of the document using XMLHTTP request (Part 2)


This is a follow-up to my previous question. With QHarr's help I was able to retrieve the content of the website by using .setRequestHeader "Cookie", "juLD4H3B=ABZHajF6AQAAH0KEfNV9kI1EEZg8m3BcrjBrBRN1ddwumUMKZVGciT2p_7ji", but this only worked for a day; I believe the cookie has since expired.

I eventually found out that another request is made to the website with additional request headers. If that request succeeds, the response includes a header carrying the cookie value.

I managed to figure out most of the required request headers, as they are easily found in the first response:

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9    
Accept-Language: en-GB,en;q=0.9
True-Client-IP: 165.225.112.130
Upgrade-Insecure-Requests: 1
X-Cloud-Trace-Context: cfcc69068c5cb2d847890a7547b3e941/1772772094880168808
X-EC-Hot-Hash: 7790000207959645976
x-ec-pop: sgb
X-EC-Session-ID: 88079078809787886379151172106634033866
X-EC-Uuid: 1570108802375324103115733450970686183758
X-Forwarded-For: 103.252.200.88, 165.225.112.130, 152.195.199.174, 34.102.254.51
X-Forwarded-Proto: https
X-Host: www.businesstimes.com.sg
fToAPHTNF0-f: AwvHZFF6AQAAy-A_IruEaP1KJTiiaipDPoplNAurzgyEgKa0yDReQsaYWX4hAaXhcIKucsP1wH8AAEB3AAAAAA==

What I can't figure out are these request headers:

fToAPHTNF0-a: FcpvG3-0vr3aA8Wo3_e0pX7pDZl24EiY8Z_p81aALmAGp_UbCYMqQFZJC_EVsQByFUoAWUXFHtv2tPyBGEBpX6XDGGvxMW2otawK-FTcSV84AFh_9q_hA7AT7EPMYMzRay8xkbRZT5g0q8T9YQJMRH5S14aPsLHbP5Qdhb7xVNR0gTL9LE_WWDzsyHyNz3Nc9oKm0pgbcM3yGA7g7U-sCcrvNSa7ITbrO2Z62mEbf6XShFUIJcPY63Kq7FyDpz1rB2L4ItGrZA3Tkfz5e5DwoIK6MIh-y4e5ob5qYtBDhkfV7uBbI-TuvLpe8HC6FjSxdP_hlEPxfJvkMf8sXSgrTaXXBwwRVBx5Yq3eBljwCjgNiLbVi6lesZVE3S0aj2Q3fDLTbyG79jys1awsPZ8jIPs9W0YSHUrKhi73umkOs3itvJkqnaw1Uf75IpTLnJ_n_ZGSp2u9pRZJBQUx2qZhhYm4tV6qnV8mkVUmg2D9FbECOH4RboTW9ON8A8lyvjoheZ5RuH-quwlGgXXqISTucrnGK2Tz7pqAC49yMH8qqc7EV7BHhjRhVp-eZFe6F7c72DrtXjjcm5fpLK-1F0MG08hZFbzthjrHTN8KvR2FcQ47rSF91izAQMGZ4rzIjGCuqPuZkdIjPLjq9tUA9KRkOs5YxSt6RalUqIGouBsYvcUJaHGJSJhzPowSVTs8mMUbY9wBZAB5G7Yn08JUHy4ZGf-Y-Fvnl0lcJr9v7yxmZSQSttEFqAT_prC3zoqzdeUuDOVWLqyUiC_oJKOA7_mcJzlMX8nnj--Iuq2Pij83rtbNDSvrXXCKi5UOCjrrV04XlFabt48MWPF0t8vrwHpM7_tE56P7IW3ZCYRPPpRHmMeJ72MwQooGtJnCJXq2Cq0itAB1GnodvyYpAhqtEzma49TB6NRSNN4U4JGiz787uaJg1pdavdOzdejbS1gh_7SDwxHo4JMhhOpEWKgCdzfTziYF0BeKshkSRJj3ejUq5cqEDg_MnqeEaWM_VBiYRtqXGK7nDNtDKPW1CV3NfX11kV9BeAXNakcJhYSh5Qk-kks0HBEmCU7uU4U8bvOThdIurVGFoDcPxZywmC3cwF0Kk_SM2dR3nuN1nMObGopLnGGIEzRh9uaIHFowYuSUYuuy0EdUjgYShYMhLSZLRCzf7dOFHndPOV-RXhG446hMDAGzLM6PIPBP18ugx4fE36l3wPvGK77Ki5eVjB8fK9l2wK1f820xUbCElL15cJNkfiQ9uicTW-QR5knEw5LEmHU92HePFUJh8qQmYAWmv9gU8eDrIJaoDlFDsgStH-erlNpiDcOxSCRVFBBq-gHcJaImucwSbvnxvvAmAGebThueOEzZAupc0P21W1Q2WijGPf6n2zqkG9BIhYEk0BhYm_1Jl2FlEOz1_EHRVHjoBycnXMFlHet6Wh_4MauDiKkM4FEehYDr-rSkyZUmRBphuIq
fToAPHTNF0-b: iyrw7f
fToAPHTNF0-c: AMDFYVF6AQAAbtw8T-EjslRuCNO9KkreSk7faXdYDWrgCCNd_bD_S_Jdp51-
fToAPHTNF0-d: AAaChAiBBKCMgUGASZAQgICQACKw_0vyXaedfv_____sbgLzAYpha0zTSuaEBn0oG8gz2gI    
fToAPHTNF0-z: q

For completeness, this link is the HTML document returned from the first response in the sample above.

My suspicion is that these values are generated within the minified script, and that there's no way for me to get the cookie without using a browser.

I appreciate all the help for this!


Solution

  • I tried using "POST" instead of "GET" and it worked for me. Here's a little bit of code that prints the headline of each article. I didn't bother parsing the rest of the information that you might want.

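    ' Requires references to "Microsoft XML, v6.0" and "Microsoft HTML Object Library".
    Sub PrintArticleTitles()
        Dim XMLPage As New MSXML2.XMLHTTP60
        Dim HTMLDoc As New MSHTML.HTMLDocument
        Dim article As Object

        ' A POST (rather than a GET) returned the page without the cookie.
        XMLPage.Open "POST", "https://www.businesstimes.com.sg/keywords/singapore-parliament", False
        XMLPage.send

        ' Load the response into an HTML document so it can be queried.
        HTMLDoc.body.innerHTML = XMLPage.responseText

        ' Each headline sits in an element with the class "widget__title".
        For Each article In HTMLDoc.getElementsByClassName("widget__title")
            Debug.Print article.innerText
        Next article
    End Sub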
    

    If you need to include a cookie, I believe you can use the following code (placed between XMLPage.Open and XMLPage.send). Only the name=value pair belongs in a Cookie request header; attributes such as Path, Secure, HttpOnly and Expires are part of the Set-Cookie response header and are not sent back, so replace the value with a fresh one once the cookie expires.

    XMLPage.setRequestHeader "Cookie", "NSC_JOlo3vprczwsrc0em1nifnbukr3oebt=ffffffff09a3792945525d5f4f58455e445a4a423660"
    

    I didn't need to include this to get the HTMLDoc, though.
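
    A cookie string copied from a browser's Set-Cookie response header arrives with attributes attached. Below is a minimal sketch of trimming it down to the name=value pair before sending it. CookiePairFromSetCookie and RequestWithCookie are hypothetical names, not part of the original answer. Late binding via CreateObject is an assumption made so that no library references are needed. Whether the server accepts the cookie is not something this sketch can guarantee.

    ```vba
    Option Explicit

    ' Hypothetical helper: keeps only the "name=value" part of a Set-Cookie
    ' header value, dropping everything from the first semicolon onwards.
    Function CookiePairFromSetCookie(ByVal setCookieValue As String) As String
        Dim semicolonPos As Long
        semicolonPos = InStr(1, setCookieValue, ";")
        If semicolonPos > 0 Then
            CookiePairFromSetCookie = Trim$(Left$(setCookieValue, semicolonPos - 1))
        Else
            CookiePairFromSetCookie = Trim$(setCookieValue)
        End If
    End Function

    Sub RequestWithCookie()
        Dim XMLPage As Object
        Dim rawCookie As String

        ' Assumption: this value was copied by hand from a Set-Cookie response header.
        rawCookie = "NSC_JOlo3vprczwsrc0em1nifnbukr3oebt=ffffffff09a3792945525d5f4f58455e445a4a423660; Path=/; Secure; HttpOnly; Expires=Sat, 03 Jul 2021 02:42:31 GMT;"

        ' Late-bound equivalent of MSXML2.XMLHTTP60.
        Set XMLPage = CreateObject("MSXML2.XMLHTTP.6.0")
        XMLPage.Open "POST", "https://www.businesstimes.com.sg/keywords/singapore-parliament", False
        XMLPage.setRequestHeader "Cookie", CookiePairFromSetCookie(rawCookie)
        XMLPage.send

        ' Print the HTTP status code so a rejected request is easy to spot.
        Debug.Print XMLPage.Status
    End Sub
    ```

    Keeping the trimming in its own function means it can be checked in the Immediate window without touching the network.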