Tags: javascript, python, web-crawler, urlopen

Why are content/values missing when using urlopen to crawl data?


I use the following Python code to crawl data:

from urllib.request import urlopen

# Download the page and decode the raw bytes as UTF-8 text
html = urlopen("https://www.hkex.com.hk/?sc_lang=en").read().decode('utf-8')
print(html)

But the content is missing and I only get:

<div class="type value"></div>

My goal is to get

<div class="type value">HK$24,225M</div>  or HK$24,225M



Solution

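  • The data on this website is filled in by JavaScript after the page loads, so it is not in the HTML that urlopen returns. Press Ctrl+U in your browser to view the raw page source and compare.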

    In this example, the data is fetched from https://www.hkex.com.hk/eng/csm/script/data_NBSZ_Turnover_eng.js or data_SBSZ_Turnover_eng.js. (I don't know which one you need.)

    In the future, look at the "Network" tab in the developer tools; you can probably find the request you need there.
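
Once you know which script file holds the data, you can request it directly instead of the HTML page. The following is only a minimal sketch: `fetch_text` and `preview` are hypothetical helper names, the browser-like User-Agent header is an assumption (some sites reject the default one), and the format of the file is not known here, so the sketch only lets you look at the beginning of it before deciding how to parse it.

```python
from urllib.request import Request, urlopen

# One of the two script URLs mentioned in the answer above.
DATA_URL = "https://www.hkex.com.hk/eng/csm/script/data_NBSZ_Turnover_eng.js"


def fetch_text(url, timeout=10):
    """Download `url` and return the response body decoded as UTF-8."""
    # Assumption: a browser-like User-Agent; the site may not require it.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        # Undecodable bytes are replaced rather than raising an error.
        return response.read().decode("utf-8", errors="replace")


def preview(text, limit=500):
    """Return at most `limit` leading characters of `text`."""
    return text[:limit]
```

Calling `print(preview(fetch_text(DATA_URL)))` should show the start of the script file, which is usually enough to see how the values are stored and what kind of parsing they need.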