Cannot get full text from url using requests_html


I'm trying to parse a page from this url: https://www.mathworks.com/help/radar/referencelist.html?type=block

I need to get all the links from the list of blocks under the "Radar Toolbox — Blocks" header, i.e. inside <div id="reflist_content">.

I'm using requests_html like this:

from requests_html import HTMLSession

session = HTMLSession()

url = 'https://www.mathworks.com/help/radar/referencelist.html?type=block'
r = session.get(url)

r.html.arender()

results = r.html.find('div')
res_str = ''

for item in results:
    #print(item)
    #print(item.text)
    res_str += str(item) + '\n'
    res_str += item.text + '\n\n'

The text of reflist_content in the results is empty.

I cannot find any of the needed content in the results. I tried searching by different HTML tags and keywords, but it seems that the table with blocks is not rendered at all. What am I doing wrong?


Solution

  • The data you see is loaded from an external URL, so it is not part of the HTML that the page itself returns. To load it, you can use this example:

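    import requests
    from bs4 import BeautifulSoup
    
    # XML file that the page fetches to build the list of blocks
    url = "https://www.mathworks.com/help/radar/referencelist_block_cat.xml"
    
    # The "xml" parser requires the lxml package to be installed
    soup = BeautifulSoup(requests.get(url).content, "xml")
    
    # Visit every <cat> element that contains at least one <ref>
    for category in soup.select("cat:has(ref)"):
        print(category.title.text)
        print()
        # select() also matches <ref> elements in nested categories
        for r in category.select("ref"):
            # Description truncated/padded to 40 characters, then the full link
            print(
                f'{r.text[:40]:<40} {"https://www.mathworks.com/help/radar/" + r["target"]}'
            )
        print()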
    

    Prints:

    Radar Toolbox
    
    Backscatter signals from bicyclist (Sinc https://www.mathworks.com/help/radar/ref/backscatterbicyclistblock.html
    Backscatter signals from pedestrian (Sin https://www.mathworks.com/help/radar/ref/backscatterpedestrianblock.html
    Barrage jammer interference source (Sinc https://www.mathworks.com/help/radar/ref/barragejammer.html
    Constant gamma clutter simulation (Since https://www.mathworks.com/help/radar/ref/constantgammaclutter.html
    Constant gamma clutter simulation using  https://www.mathworks.com/help/radar/ref/gpuconstantgammaclutter.html
    Generate radar sensor detections and tra https://www.mathworks.com/help/radar/ref/radardatagenerator.html
    Combine detection reports from different https://www.mathworks.com/help/radar/ref/detectionconcatenation.html
    Two-ray channel environment (Since R2021 https://www.mathworks.com/help/radar/ref/tworaychannel.html
    Wideband two-ray channel environment (Si https://www.mathworks.com/help/radar/ref/widebandtworaychannel.html
    Library of pulse waveforms (Since R2021a https://www.mathworks.com/help/radar/ref/pulsewaveformlibrary.html
    Library of pulse compression specificati https://www.mathworks.com/help/radar/ref/pulsecompressionlibrary.html
    Cluster detections (Since R2021a)        https://www.mathworks.com/help/radar/ref/dbscanclusterer.html
    
    Data Synthesis
    
    Backscatter signals from bicyclist (Sinc https://www.mathworks.com/help/radar/ref/backscatterbicyclistblock.html
    Backscatter signals from pedestrian (Sin https://www.mathworks.com/help/radar/ref/backscatterpedestrianblock.html
    Barrage jammer interference source (Sinc https://www.mathworks.com/help/radar/ref/barragejammer.html
    Constant gamma clutter simulation (Since https://www.mathworks.com/help/radar/ref/constantgammaclutter.html
    Constant gamma clutter simulation using  https://www.mathworks.com/help/radar/ref/gpuconstantgammaclutter.html
    Generate radar sensor detections and tra https://www.mathworks.com/help/radar/ref/radardatagenerator.html
    Combine detection reports from different https://www.mathworks.com/help/radar/ref/detectionconcatenation.html
    Two-ray channel environment (Since R2021 https://www.mathworks.com/help/radar/ref/tworaychannel.html
    Wideband two-ray channel environment (Si https://www.mathworks.com/help/radar/ref/widebandtworaychannel.html
    Library of pulse waveforms (Since R2021a https://www.mathworks.com/help/radar/ref/pulsewaveformlibrary.html
    
    Signal and Data Processing
    
    Library of pulse compression specificati https://www.mathworks.com/help/radar/ref/pulsecompressionlibrary.html
    Cluster detections (Since R2021a)        https://www.mathworks.com/help/radar/ref/dbscanclusterer.html
    
    Detection, Range, Angle, and Doppler Estimation
    
    Library of pulse compression specificati https://www.mathworks.com/help/radar/ref/pulsecompressionlibrary.html
    
    Clustering
    
    Cluster detections (Since R2021a)        https://www.mathworks.com/help/radar/ref/dbscanclusterer.html
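
The output above lists the same block under several headings, apparently because a category's `ref` elements are also matched from its parent categories. If only a flat list of unique links is needed, something like the following sketch may help. It uses the standard library's xml.etree.ElementTree instead of BeautifulSoup, and it assumes the structure implied by the selectors in the answer (`cat`, `title`, and `ref` elements with a `target` attribute, no XML namespaces). `SAMPLE_XML`, the `reflist` root element, and `unique_links` are hypothetical names made up for illustration; the real file may differ.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

BASE_URL = "https://www.mathworks.com/help/radar/"

# Hypothetical sample imitating the assumed structure of the real XML file.
SAMPLE_XML = """<reflist>
  <cat>
    <title>Radar Toolbox</title>
    <cat>
      <title>Signal and Data Processing</title>
      <ref target="ref/pulsecompressionlibrary.html">Library of pulse compression specifications</ref>
      <cat>
        <title>Clustering</title>
        <ref target="ref/dbscanclusterer.html">Cluster detections</ref>
      </cat>
    </cat>
    <cat>
      <title>Duplicate example</title>
      <ref target="ref/dbscanclusterer.html">Cluster detections</ref>
    </cat>
  </cat>
</reflist>"""


def unique_links(xml_text, base_url=BASE_URL):
    """Return (description, absolute_url) pairs, one per distinct target."""
    root = ET.fromstring(xml_text)
    seen = {}  # target -> description; dicts keep insertion order
    for ref in root.iter("ref"):  # walks every <ref>, at any nesting depth
        target = ref.get("target")
        if target and target not in seen:
            seen[target] = "".join(ref.itertext()).strip()
    return [(text, urljoin(base_url, target)) for target, text in seen.items()]


if __name__ == "__main__":
    for text, link in unique_links(SAMPLE_XML):
        print(f"{text[:40]:<40} {link}")
```

To try it against the live file, the downloaded bytes (for example `requests.get(url).content` from the answer) could be passed in place of `SAMPLE_XML`, provided the real document matches these assumptions.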