python screen-scraping beautifulsoup

Scraping value from field on a page


I'm scraping a page that has a link like this:

<a id="something" href="place" class="thing" data="12345">
<span class="otherthing"></span></a>

I'd like to extract the number in the field called data. I've been trying to use BeautifulSoup like this:

soup = BeautifulSoup(response)
for a in soup.findAll('a'):
        if 'data' in a['a']:
                print a['a']['data']

But I'm getting a KeyError.


Solution

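  • Indexing a tag looks up one of its attributes, so a['a'] asks for an attribute named a, which this link doesn't have; that is where the KeyError comes from. Index the tag by the attribute name directly, and check that the attribute exists first: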

    for a in soup.findAll('a'):
        if a.has_attr('data'):
            print(a['data'])
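
For anyone who wants to try this without the original page, here is a minimal self-contained sketch. It assumes the bs4 package is installed and uses the built-in html.parser. The html string is just the snippet from the question plus a second, made-up link with no data attribute, and the name values is only for illustration. Passing attrs={"data": True} asks find_all for tags that have the attribute at all, which does the same filtering as the has_attr check.

```python
from bs4 import BeautifulSoup

# Sample markup: the link from the question, plus a hypothetical second
# link without a data attribute to show that it gets skipped.
html = """
<a id="something" href="place" class="thing" data="12345">
<span class="otherthing"></span></a>
<a href="elsewhere">no data attribute here</a>
"""

soup = BeautifulSoup(html, "html.parser")

# attrs={"data": True} matches only <a> tags that carry a data attribute,
# so indexing with a["data"] cannot raise KeyError here.
values = [a["data"] for a in soup.find_all("a", attrs={"data": True})]
print(values)
```

If you'd rather not filter up front, a.get('data') returns None instead of raising when the attribute is missing.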