BeautifulSoup: scraping a table by class attribute -- why don't I get any data?

I'm trying to scrape the ticker symbols located here using BeautifulSoup. So far, I've tried the following:

import urllib
import BeautifulSoup
import re

url  = r''
html = urllib.urlopen(url).read()
soup = BeautifulSoup.BeautifulSoup(html)

table = soup.findAll('td', attrs = {'class': re.compile(r'\bticker left\b')})

This doesn't give me anything, however. Can someone explain why I can't get all td tags with this class attribute? The HTML suggests this should be possible and relatively painless. For example:

<td class="ticker left">VUSXX              </td>

Thank you.


  • Continuing my comment above: you can request the following URL directly, which returns the required data (I found it with the Firefox extension Live HTTP Headers).


    You could also use Selenium, which drives a real Firefox browser.

    1) Install Selenium IDE

    2) Install the Selenium Python module

    Then you can use the following script, which opens a Firefox browser and gets the results.

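    from selenium import webdriver
    from bs4 import BeautifulSoup  # use bs4 from now on

    # The page address is not given here; use the URL from the question.
    url = r''
    browser = webdriver.Firefox()   # opens a Firefox window
    browser.get(url)                # load the page so its JavaScript runs
    html = browser.page_source      # HTML after the browser has rendered it
    browser.quit()                  # close the browser once the HTML is saved
    soup = BeautifulSoup(html, 'html.parser')
    mydata = soup.find_all('tr')    # every table row on the page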

    You can then find what you want in mydata.
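
Once you have the rendered HTML, the cells from the question can be picked out by class. Here is a minimal sketch, not taken from the original page. To keep it runnable without extra installs it uses the standard library's html.parser instead of BeautifulSoup. The names TickerCellParser and extract_tickers and the sample markup are made up for illustration; only the first cell is copied from the question.

```python
from html.parser import HTMLParser


class TickerCellParser(HTMLParser):
    """Collect the text of every <td> whose class list has both 'ticker' and 'left'."""

    def __init__(self):
        super().__init__()
        self.tickers = []
        self._in_cell = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            # class="ticker left" is two separate classes, so test each one.
            classes = (dict(attrs).get("class") or "").split()
            if "ticker" in classes and "left" in classes:
                self._in_cell = True
                self._chunks = []

    def handle_data(self, data):
        if self._in_cell:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            # Strip the padding that follows the symbol in the cell.
            self.tickers.append("".join(self._chunks).strip())
            self._in_cell = False


def extract_tickers(html):
    """Return the stripped text of all matching cells in an HTML string."""
    parser = TickerCellParser()
    parser.feed(html)
    parser.close()
    return parser.tickers


# Illustrative markup only; the second cell is invented.
sample = (
    "<table><tr>"
    '<td class="ticker left">VUSXX              </td>'
    '<td class="name left">Some fund</td>'
    "</tr></table>"
)
print(extract_tickers(sample))  # prints ['VUSXX']
```

With bs4, soup.select('td.ticker.left') should select the same cells. If extract_tickers returns an empty list for the HTML fetched with urllib, the cells are not in the static page at all and are filled in later by a script, which is why the browser-rendered source is needed.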