I have some HTML laid out like this:
<div class="news-a">
  <article>
    <header>
      <h2>
        <a>destination 1</a>
      </h2>
    </header>
  </article>
  <article>
    <header>
      <h2>
        <a>destination 2</a>
      </h2>
    </header>
  </article>
  <article>
    <header>
      <h2>
        <a>destination 3</a>
      </h2>
    </header>
  </article>
</div>
I am trying to use BeautifulSoup to return all of the destination names, so I have targeted the div with the class "news-a" because I know there is only one of these on the site. My scraper code is as follows:
import requests
from bs4 import BeautifulSoup

page = requests.get('url')
soup = BeautifulSoup(page.content, 'html.parser')
# Find the single container div, then loop over every h2 inside it
destinations = soup.find(class_='news-a')
for destination in destinations.find_all('h2'):
    print(destination.text)
But this only returns the first result, "destination 1", when run against the live URL.
How about this one? It uses a CSS selector with the lxml parser, which is more concise and gives the desired output:
import requests
from bs4 import BeautifulSoup

page = requests.get('http://www.travelindicator.com/destinations?page=1').text
soup = BeautifulSoup(page, "lxml")
# CSS selector: every <a> inside an <h2> under the .news-a container
for item in soup.select(".news-a h2 a"):
    print(item.text)
Result:
Con Dao
Kuwait City
Funafuti
Saint Helier
Mount Kailash
Sunny Beach
Krakow
Azores
Alsace
Qaqortoq
Salt Lake City
Valkenburg
Daegu
Lviv
São Luís
Abidjan
Lampedusa
Lecce
Norfolk Island
Petra
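If you want to check the selector without hitting the live site, here is a minimal sketch that runs both approaches against the sample markup from the question. The names SAMPLE_HTML, destination_names and destination_names_find are hypothetical helpers made up for this illustration, and the built-in 'html.parser' is used only so that the snippet does not need lxml installed.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (an assumption: the live page
# may be structured differently or contain malformed HTML).
SAMPLE_HTML = """
<div class="news-a">
  <article><header><h2><a>destination 1</a></h2></header></article>
  <article><header><h2><a>destination 2</a></h2></header></article>
  <article><header><h2><a>destination 3</a></h2></header></article>
</div>
"""


def destination_names(html, parser="html.parser"):
    """Return the text of every <a> inside an <h2> under .news-a (CSS selector)."""
    soup = BeautifulSoup(html, parser)
    return [a.get_text(strip=True) for a in soup.select(".news-a h2 a")]


def destination_names_find(html, parser="html.parser"):
    """Same result via find()/find_all(), as in the question's code."""
    soup = BeautifulSoup(html, parser)
    container = soup.find(class_="news-a")
    if container is None:
        # No matching container: return an empty list instead of raising
        return []
    return [h2.get_text(strip=True) for h2 in container.find_all("h2")]


names = destination_names(SAMPLE_HTML)
print(names)  # ['destination 1', 'destination 2', 'destination 3']
```

On this well-formed sample both functions return all three names. That hints, though it does not prove, that the difference seen on the live page comes from the page's real markup and how each parser repairs it, so passing parser="lxml" to either helper is worth trying there.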