Tags: python, html, beautifulsoup, screen-scraping

Scraping music playlist info from a website


I am trying to scrape the names of artists and songs from the online playlog of a daily radio show I like. Eventually I'd like to use the scraped data in Python to compile a playlist on Spotify or YouTube.

Why won't my code retrieve and print all the songs?

    import urllib2
    from bs4 import BeautifulSoup # latest version bs4

    soup = BeautifulSoup(urllib2.urlopen("http://music.cbc.ca/#!/The-Signal").read(), 'lxml')

    song = soup.find_all("span", {'class': 'logTrackTitle'})

    print song



Solution

  • If you inspect the page with Chrome DevTools, you'll see that the 'Broadcast Log' section is actually an iframe with its own, different URL. That's where the list of songs comes from: the song markup lives in that separate document rather than in the HTML of the page you requested.

    Swapping the iframe's URL into your code correctly returns the songs.

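    import urllib2
    from bs4 import BeautifulSoup # latest version bs4
    
    # Same code as in the question, with the iframe's URL swapped in.
    soup = BeautifulSoup(urllib2.urlopen("http://music.cbc.ca/The-Signal").read(), 'lxml')
    
    # Collect every <span class="logTrackTitle"> element in that document.
    song = soup.find_all("span", {'class': 'logTrackTitle'})
    
    print song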
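
If you'd rather find the iframe from code than from DevTools, something along these lines might help. It is only a sketch, written for Python 3 and using the standard library's html.parser instead of BeautifulSoup so it runs without extra packages. SAMPLE_HTML and the iframe path in it are made up for illustration; they are not the real page's markup. It also shows that everything after the # in the question's URL is a fragment, which is not sent to the server, so the request only ever asks for the base page.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class IframeFinder(HTMLParser):
    """Collect the src attribute of every <iframe> start tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def iframe_urls(page_url, html):
    """Return absolute URLs for all iframes found in html."""
    finder = IframeFinder()
    finder.feed(html)
    # Drop the fragment first; it plays no part in resolving relative links.
    base, _fragment = urldefrag(page_url)
    return [urljoin(base, src) for src in finder.sources]


# Hypothetical markup standing in for the real page; the iframe path is invented.
SAMPLE_HTML = """
<html><body>
  <h1>The Signal</h1>
  <iframe src="/example-broadcast-log"></iframe>
</body></html>
"""

page_url = "http://music.cbc.ca/#!/The-Signal"

# The URL that is actually requested once the fragment is removed.
print(urldefrag(page_url).url)

# Candidate URLs to scrape instead of the outer page.
print(iframe_urls(page_url, SAMPLE_HTML))
```

In practice you would pass the HTML you downloaded for the outer page to iframe_urls, then run the find_all call from the answer against each URL it returns.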