Python, Web Scraping


I tried scraping a website using BeautifulSoup, and no matter which method or selector I try, it always returns an empty list. This was supposed to print the top 100 songs on the Billboard chart for a given date.

from bs4 import BeautifulSoup
import requests

date = input("Which year do you want to travel to? Type the date in this format YYYY-MM-DD: ")

response = requests.get("https://www.billboard.com/charts/hot-100/" + date)

soup = BeautifulSoup(response.text, 'html.parser')
song_names_spans = soup.find_all("span", class_="chart-element__information__song")
song_names = [song.getText() for song in song_names_spans]

Solution

  • It looks like your .find_all() call doesn't match anything on the page. Try .select() with a CSS selector instead, and copy the list of classes that the song titles have from the developer tools in your browser (I chose the first four: c-title, a-no-trucate, a-font-primary-bold-s, and u-letter-spacing-0021, and it worked). Like this:

    from bs4 import BeautifulSoup
    import requests
    
    date = input("Which year do you want to travel to? Type the date in this format YYYY-MM-DD: ")
    
    response = requests.get("https://www.billboard.com/charts/hot-100/" + date)
    
    soup = BeautifulSoup(response.text, 'html.parser')
    song_names_els = soup.select('h3.c-title.a-no-trucate.a-font-primary-bold-s.u-letter-spacing-0021')
    song_names = [song.getText().strip() for song in song_names_els]
    
    print(song_names)
    

    Note that the song titles are <h3> tags, not <span>s, so you should search for <h3>s instead.
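
If you want to check the selector without hitting the site, here is a minimal offline sketch. The HTML snippet is made up for illustration (only the four class names come from the answer above; the extra class, the `c-label` spans, and the song names are assumptions), so the real page may look different and can change at any time. It shows that the old `find_all()` call finds nothing, while the compound class selector matches every `<h3>` that carries all four classes, in any order.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only. The four class names are the
# ones quoted in the answer; everything else here is an assumption.
SAMPLE_HTML = """
<ul>
  <li>
    <h3 class="c-title a-no-trucate a-font-primary-bold-s u-letter-spacing-0021 extra-class">
      Song One
    </h3>
    <span class="c-label">Artist A</span>
  </li>
  <li>
    <h3 class="u-letter-spacing-0021 c-title a-font-primary-bold-s a-no-trucate">
      Song Two
    </h3>
    <span class="c-label">Artist B</span>
  </li>
  <li><h3 class="c-title">Not a chart entry</h3></li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The selector from the question: no <span> with this class exists here.
old_matches = soup.find_all("span", class_="chart-element__information__song")

# A compound class selector requires every listed class to be present,
# in any order, and tolerates additional classes on the element.
selector = "h3.c-title.a-no-trucate.a-font-primary-bold-s.u-letter-spacing-0021"
song_names = [el.get_text(strip=True) for el in soup.select(selector)]

print(old_matches)  # []
print(song_names)   # ['Song One', 'Song Two']
```

The third `<h3>` is skipped because it only has `c-title`, which is why listing several classes in the selector helps filter out unrelated headings.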