Python: find elements from BeautifulSoup


Using BeautifulSoup, I'm not able to extract all the elements that I need.

For example, from this part:

[<div class="item-info-container">
 <picture class="logo-branding">
 <a data-markup="listado::logo-agencia" href="/en/pro/remax-yes/" title="RE/MAX Yes">
 <img alt="RE/MAX Yes" src="https://st3.idealista.pt/4b/f5/86/remax-yes.gif"/>
 </a>
 </picture>
 <a aria-level="2" class="item-link" href="/en/imovel/31306786/" role="heading" title="T3 flat in rua das Janelas Verdes 128, tornejando para a Tv. Das Atafonas 1, 1, Prazeres, Estrela">T3 flat in rua das Janelas Verdes 128, tornejando para a Tv. Das Atafonas 1, 1, Prazeres, Estrela</a>
 <div class="price-row ">
 <span class="item-price h2-simulated">769,000<span class="txt-big">€</span></span>
 <span class="item-parking">Garage included</span>
 <span class="pricedown">
 <span class="pricedown_price">
 785,000 €
 </span>
 <span class="pricedown_icon icon-pricedown">2%</span>
 </span>
 </div>
 <span class="item-detail">T3 <small></small></span>
 <span class="item-detail">147 <small>m²</small></span>
 <span class="item-detail">2nd floor <small> with lift</small></span>
 <div class="item-description description">
 <p class="ellipsis ">
 T3 full of charm, comfort and tranquility
 Excellent T3 with 147 m2, inserted in a private condominium with garden, with 2 parking spaces...
 </p>
 <span class="listing-tags">
 Luxury
 </span>
 </div>
 <div class="item-toolbar">
 <span class="icon-phone item-not-clickable-phone">215552845</span>
 <a class="icon-phone phone-btn item-clickable-phone" href="tel:+351 215552845" target="_blank">
 <span>Call</span>
 </a>
 <button class="icon-chat email-btn action-email fake-anchor"><span>Contact</span></button>
 <button class=" favorite-btn action-fav fake-anchor" data-role="add" data-text-add="Save" data-text-remove="Favourite" title="Save">
 <i class="icon-heart" role="image"></i>
 <span>Save</span>
 </button>
 <button class="icon-delete trash-btn action-discard fake-anchor" data-role="add" data-text-remove="Discard" rel="nofollow" title="Discard">
 </button>
 </div>

I have already been able to extract the PRICE, TYPOLOGY and CONTACT:

price=all[0].find("span",{"class":"item-price h2-simulated"}).text
tipology=all[0].find("span",{"class":"item-detail"}).text
contact=all[0].find("span",{"class":"icon-phone item-not-clickable-phone"}).text

I still haven't been able to extract the LINK and TITLE:

 <a aria-level="2" class="item-link" href="/en/imovel/31306786/" role="heading" title="T3 flat in rua das Janelas Verdes 128, tornejando para a Tv. Das Atafonas 1, 1, Prazeres, Estrela">T3 flat in rua das Janelas Verdes 128, tornejando para a Tv. Das Atafonas 1, 1, Prazeres, Estrela</a>

Nor the HOUSE AREA and HOUSE FLOOR, because they use the same tag and class as the typology:

     <span class="item-detail">147 <small>m²</small></span>
     <span class="item-detail">2nd floor <small> with lift</small></span>

Can someone help me with this?


Solution

  • You can try as follows:

    # href attribute of the listing anchor (a relative URL)
    link=all[0].find("a",class_="item-link")['href']

    # visible text of the same anchor
    title=all[0].find("a",class_="item-link").get_text()
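
    For the house area and floor, one possible approach is to collect every "item-detail" span with find_all() instead of find(), since find() only ever returns the first match. The sketch below is a minimal, self-contained illustration and not necessarily how the full page should be handled: the HTML is a trimmed copy of the snippet in the question (with a shortened title), `item` stands in for all[0], and it assumes the three spans always appear in the order typology, area, floor.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the listing markup from the question (title shortened).
html = """
<div class="item-info-container">
 <a aria-level="2" class="item-link" href="/en/imovel/31306786/" role="heading" title="T3 flat in rua das Janelas Verdes 128">T3 flat in rua das Janelas Verdes 128</a>
 <span class="item-detail">T3 <small></small></span>
 <span class="item-detail">147 <small>m²</small></span>
 <span class="item-detail">2nd floor <small> with lift</small></span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
# `item` plays the role of all[0] in the question.
item = soup.find("div", class_="item-info-container")

# Link and title both come from the same anchor.
anchor = item.find("a", class_="item-link")
link = anchor["href"]
title = anchor.get_text(strip=True)

# find_all() returns every matching span in document order.
# get_text(" ", strip=True) joins the span text and its <small> child with one space.
details = [
    span.get_text(" ", strip=True)
    for span in item.find_all("span", class_="item-detail")
]

# Assumption: typology, area and floor are always present, in this order.
tipology, area, floor = details[0], details[1], details[2]

print(link, title, tipology, area, floor, sep="\n")
```

    If some listings have fewer than three "item-detail" spans, indexing by position will raise an IndexError, so checking len(details) first is probably safer.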