python python-2.7 web-scraping beautifulsoup mechanize

Parsing all options under a select in BeautifulSoup


I have an HTML page with multiple select tags, each with several dropdown options. I want to parse all the options under each select and store them.

This is what the HTML looks like:

<select name="primary_select">
    <option></option>
    <option></option>
</select>
<select name="secondary_select">
    <option></option>
    <option></option>
</select>

I am using BeautifulSoup and mechanize in Python. This is what my code looks like:

soup = BeautifulSoup(response.get_data())

subject_options = soup.findAll('select', attrs={'name': 'primary_select'}).findAll("option")
print subject_options

I am getting the following error:

AttributeError: 'ResultSet' object has no attribute 'findAll'

Thanks for helping :)


Solution

  • findAll returns a ResultSet (a list of matching tags), not a single tag, so you can't call findAll on it directly. Call findAll on each element of the result instead.

    from bs4 import BeautifulSoup
    html = '''<select name="primary_select">
        <option></option>
        <option></option>
    </select>
    <select name="secondary_select">
        <option></option>
        <option></option>
    </select>'''
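    # Name the parser explicitly so bs4 does not have to guess one.
    soup = BeautifulSoup(html, 'html.parser')
    # Call findAll('option') on each matched <select>, not on the ResultSet.
    subject_options = [i.findAll('option') for i in soup.findAll('select', attrs={'name': 'primary_select'})]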
    print subject_options
    

    Output:

    [[<option></option>, <option></option>]]
    

    Alternatively, use CSS selectors:

    soup = BeautifulSoup(html, 'html.parser')
    # '>' matches only <option> tags that are direct children of the <select>.
    subject_options = soup.select('select[name=primary_select] > option')
    print subject_options
    

    To get the options under every select, as the question asks, drop the name filter:

    subject_options = soup.select('select > option')
    print subject_options
    

    Output:

    [<option></option>, <option></option>, <option></option>, <option></option>]
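
The flat list above no longer shows which select each option came from. If you want to store the options per select, one possible approach is a dict keyed by the select's name. This is only a sketch: the option values and labels in the markup are made up for illustration, and `options_by_select` is a hypothetical variable name, not something from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the option values and labels are invented for this example.
html = '''<select name="primary_select">
    <option value="1">Math</option>
    <option value="2">Physics</option>
</select>
<select name="secondary_select">
    <option value="a">Algebra</option>
    <option value="b">Optics</option>
</select>'''

soup = BeautifulSoup(html, 'html.parser')

# Map each select's name attribute to the text of its options.
options_by_select = {}
for select in soup.find_all('select'):
    # find_all is called on each individual Tag, never on the ResultSet.
    options_by_select[select.get('name')] = [
        option.get_text(strip=True) for option in select.find_all('option')
    ]

print(options_by_select)
```

Note that select.find_all('option') also picks up options nested inside an optgroup, whereas the 'select > option' selector only matches direct children.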