python pandas beautifulsoup select-options

Extracting ids and options from select elements with BeautifulSoup and arranging them in a Pandas DataFrame

I have extracted the following HTML:

<select class="class1" id="id1">
    <option value="0">A1</option>
    <option value="1">A2</option>
    <option value="2">A3</option>
    <option value="3">A4</option>
    <option value="4">A5</option>
    <option value="5">A6</option>
</select>
.
.
.
<select class="class2" id="id2">
    <option value="0">B1</option>
    <option value="1">B2</option>
    <option value="2">B3</option>
</select>
.
.
<select class="class3" id="id3">
    <option value="0">C1</option>
    <option value="1">C2</option>
    <option value="2">C3</option>
    <option value="3">C4</option>
</select>

I need to extract the text of every option together with the id of its parent select, and arrange them in a Pandas DataFrame with one row per option, like this:

id	option
id1	A1
id1	A2
id1	A3
id1	A4
id1	A5
id1	A6
id2	B1
id2	B2
id2	B3
id3	C1
id3	C2
id3	C3
id3	C4

Solution

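Parse the HTML with BeautifulSoup, loop over each select, and for every option inside it record the select's id alongside the option's text. The two lists can then be passed to Pandas as a dict of columns.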

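import pandas as pd
from bs4 import BeautifulSoup

# html is the string holding the markup shown in the question
soup = BeautifulSoup(html, 'html.parser')

# Collect one (id, option text) pair per <option>
d = {'id': [], 'option': []}
for select in soup.find_all('select'):
    for option in select.find_all('option'):
        d['id'].append(select['id'])
        d['option'].append(option.text)
df = pd.DataFrame(d)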

Output:

>>> df
     id option
0   id1     A1
1   id1     A2
2   id1     A3
3   id1     A4
4   id1     A5
5   id1     A6
6   id2     B1
7   id2     B2
8   id2     B3
9   id3     C1
10  id3     C2
11  id3     C3
12  id3     C4
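
The same nested loop can also be written as a comprehension that builds one record per option. This is only a sketch of an equivalent variant, not a different method: the function name options_to_frame and the shortened sample markup are made up for illustration, and select.get('id') is used on the assumption that a select without an id should give None rather than raise a KeyError.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Shortened sample markup (assumption: stands in for the extracted HTML)
html = """
<select class="class2" id="id2">
    <option value="0">B1</option>
    <option value="1">B2</option>
</select>
<select class="class3" id="id3">
    <option value="0">C1</option>
</select>
"""

def options_to_frame(markup):
    """Return a DataFrame with one row per <option>: parent select id and option text."""
    soup = BeautifulSoup(markup, "html.parser")
    rows = [
        (select.get("id"), option.get_text(strip=True))
        for select in soup.find_all("select")
        for option in select.find_all("option")
    ]
    return pd.DataFrame(rows, columns=["id", "option"])

df = options_to_frame(html)
print(df)
```

Building a list of tuples and naming the columns once keeps the id and option values paired by construction, so the two columns cannot drift out of step the way two separately appended lists could.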