python, python-3.x, hyperlink, screen-scraping, google-crawlers

Extracting links from website using Python, NOT IN HTML


I need to gather the PDF files from this page: http://www.anp.gov.br/?id=532.

I wonder how this is possible in Python when I can't find the links in the HTML source code. Previously, I have found links to such files using BeautifulSoup and pandas.

Thanks for any answers!


Solution

  • It looks like all of the PDF links are in <a> tags, so you can use BeautifulSoup to grab them. If you need further advice, I recommend this discussion, which shows how to accomplish that task.

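    As a rough illustration of grabbing PDF links from <a> tags, here is a minimal sketch. It swaps in Python's standard-library html.parser in place of BeautifulSoup so that it runs without extra installs; with BeautifulSoup the equivalent loop would iterate over soup.find_all("a", href=True). The names PdfLinkParser and extract_pdf_links and the sample markup are hypothetical, and the real page's markup may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a href="..."> links whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags carry the links we are interested in.
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        # Check the path only, so query strings like ?v=2 do not interfere.
        if urlparse(absolute).path.lower().endswith(".pdf"):
            self.pdf_links.append(absolute)


def extract_pdf_links(html, base_url):
    """Return every PDF link found in the given HTML string."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_links


# Hypothetical sample markup, not copied from the real page.
sample_html = """
<ul>
  <li><a href="/docs/report.pdf">Report</a></li>
  <li><a href="page.html">Other page</a></li>
  <li><a href="http://example.com/files/Data.PDF?v=2">Data</a></li>
</ul>
"""

links = extract_pdf_links(sample_html, "http://www.anp.gov.br/?id=532")
for link in links:
    print(link)
```

    To run this against the live page, the HTML string could be fetched first, for example with urllib.request.urlopen or the requests library. If the links are only inserted by JavaScript after the page loads, the fetched HTML will not contain them and this approach will return an empty list.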