Tags: python, web-scraping, beautifulsoup, html-parsing

Getting all links from a page with Beautiful Soup


I am using Beautiful Soup to get all the links from a page. My code is:

import requests
from bs4 import BeautifulSoup


url = 'http://www.acontecaeventos.com.br/marketing-promocional-sao-paulo'
r = requests.get(url)
html_content = r.text
soup = BeautifulSoup(html_content, 'lxml')

soup.find_all('href')

All that I get is:

[]

How can I get a list of all the href links on that page?


Solution

  • You are asking find_all for tags named href, but href is an attribute, not a tag, so nothing matches and you get an empty list.

    Look for the <a> tags instead; they are the elements that represent links.

    links = soup.find_all('a')
    

    Later you can access their href attributes like this:

    link = links[0]          # first <a> tag on the page
    url  = link['href']      # value of href; raises KeyError if the attribute is missing
    url  = link.get('href')  # same value, but returns None if the attribute is missing
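
To get the full list the question asks for, the two steps above can be combined. The following is a minimal sketch, not a definitive implementation: the inline HTML string is a made-up stand-in for r.text so the example runs without a network request, it uses the built-in 'html.parser' instead of 'lxml' to avoid an extra dependency, and resolving relative links with urljoin is an optional extra that the question did not ask for.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for r.text from the question.
html_content = """
<html><body>
  <a href="/eventos">Eventos</a>
  <a href="http://example.com/contato">Contato</a>
  <a name="top">Anchor without an href</a>
</body></html>
"""

# URL of the page being parsed (taken from the question), used as the base
# for resolving relative links.
page_url = 'http://www.acontecaeventos.com.br/marketing-promocional-sao-paulo'

soup = BeautifulSoup(html_content, 'html.parser')

# href=True restricts the match to <a> tags that actually carry an href
# attribute, so a['href'] cannot raise KeyError here.
hrefs = [a['href'] for a in soup.find_all('a', href=True)]

# Optional: turn relative links such as "/eventos" into absolute URLs.
absolute_links = [urljoin(page_url, href) for href in hrefs]

print(hrefs)
print(absolute_links)
```

With a real page, replace html_content with r.text from the requests call in the question. The href=True filter matters because anchors without an href (for example named anchors) are also <a> tags.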