Extracting a specific href from table


I'm trying to extract the "10-K" URL from the following page and append it to a list:

https://www.sec.gov/Archives/edgar/data/320193/000091205701544436/0000912057-01-544436-index.htm

So basically I'm trying to extract the first link in the first table row that is not a header row.

I also want to loop this code over multiple similar links, but I'd like to resolve this issue first.

Any ideas?


Solution

  • Hope this answers your requirement.

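    import requests
    from bs4 import BeautifulSoup
    from urllib.parse import urljoin
    
    URL = "https://www.sec.gov/Archives/edgar/data/320193/000091205701544436/0000912057-01-544436-index.htm"
    # SEC.gov may reject requests that lack a descriptive User-Agent;
    # the value below is a placeholder, replace it with your own contact details.
    HEADERS = {"User-Agent": "Your Name your.email@example.com"}
    page = requests.get(URL, headers=HEADERS)
    page.raise_for_status()
    
    soup = BeautifulSoup(page.content, "html.parser")
    
    href_list = []
    for row in soup.find_all("tr"):
        # Header rows contain only <th> cells, so their list of <td> texts is empty.
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        link = row.find("a", href=True)
        # Assumes the filing's row has a cell whose text is exactly "10-K".
        if link and "10-K" in cells:
            # Store the href string (made absolute), not the tag object.
            href_list.append(urljoin(URL, link["href"]))
            break
    
    print(href_list)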
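
For the follow-up of running this over several similar index pages, one possible sketch is to split the parsing from the fetching. The function names `extract_form_link` and `collect_form_links`, the `HEADERS` placeholder, and the assumption that each page has a table row with a cell reading exactly "10-K" are mine, not something confirmed by the question.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

# Placeholder contact details (assumption); replace with your own.
HEADERS = {"User-Agent": "Your Name your.email@example.com"}


def extract_form_link(html, base_url, form_type="10-K"):
    """Return the absolute URL of the first link found in a table row
    that has a cell equal to form_type, or None if no row matches."""
    soup = BeautifulSoup(html, "html.parser")
    for row in soup.find_all("tr"):
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        link = row.find("a", href=True)
        if link and form_type in cells:
            return urljoin(base_url, link["href"])
    return None


def collect_form_links(index_urls, form_type="10-K"):
    """Fetch each index page and collect the matching document URLs."""
    found = []
    for url in index_urls:
        response = requests.get(url, headers=HEADERS, timeout=30)
        response.raise_for_status()
        link = extract_form_link(response.text, url, form_type)
        if link is not None:
            found.append(link)
    return found
```

Calling `collect_form_links([URL])` with the URL from the question should then give a list with one entry per page that has a matching row. Keeping `extract_form_link` free of network calls also makes it easy to try on saved HTML first.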