I know similar questions have been asked before, but none of the solutions I tried to adapt produced the desired result. Suppose a bs4 soup contains many elements like the one below:
<a class="employee background-white text-center col-xs-6 col-sm-4 col-md-3" data-cid="74" href="extract_this_link">
<div class="image" style="background-image: url(xxx.jpg) !important">
<div class="overlay flex center">
<div class="background">
</div>
</div>
</div>
<div class="bubble-description">
<p>
<b>
content1
</b>
<br/>
content2
</p>
</div>
</a>
<a class="hidden" href="link1">
</a>
<a class="hidden" href="link2">
</a>
<a class="hidden" href="link3">
</a>
How can I extract the link in the very first line (href="extract_this_link") from every such element in the soup and store the results in a list?
Any help is greatly appreciated!
# select() returns every match; select_one() returns only the first tag
goal = [x['href'] for x in soup.select('a.employee')]
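For reference, here is a minimal runnable sketch, assuming the markup is parsed with BeautifulSoup's built-in `html.parser`. The sample HTML is a shortened version of the snippet in the question, and the second employee entry (`data-cid="75"`, `another_link`) is made up purely for illustration. Note that `select` returns a list of all matching tags, whereas `select_one` returns a single tag, so looping over its result walks that tag's children instead of the matching anchors.

```python
from bs4 import BeautifulSoup

# Shortened sample markup; the second "employee" anchor is hypothetical.
html = """
<a class="employee background-white text-center" data-cid="74" href="extract_this_link">
  <div class="bubble-description"><p><b>content1</b><br/>content2</p></div>
</a>
<a class="hidden" href="link1"></a>
<a class="employee background-white text-center" data-cid="75" href="another_link"></a>
"""

soup = BeautifulSoup(html, "html.parser")

# Match only <a> tags that carry the "employee" class and have an href attribute,
# then collect the href values in document order.
goal = [a["href"] for a in soup.select("a.employee[href]")]
print(goal)
```

The `[href]` part of the selector is optional; it just guards against a `KeyError` if some matching anchor happens to lack an `href`.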