python-2.7 beautifulsoup scraper

How can I extract text from a span tag using Beautiful Soup 4?


How can I scrape the text inside span tags using Beautiful Soup? I am trying to scrape faculty members' information.

from bs4 import BeautifulSoup
import requests
r = requests.get("http://www.uoj.ac.ae/ContentBan.aspx?m=15&p=4&sm=4")
soup = BeautifulSoup(r.content, 'html5lib')
for tag in soup.find_all('table'):
    if tag.has_attr("class"):
        if tag['class'] == 'MsoTableGrid':
            for tag1 in soup.findAll('span'):
                print tag1.text

I want to print the text inside the span tags, but the only output I get is:

 Process finished with exit code 0

Solution

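  • Select the tr elements of the table with class MsoTableGrid using a CSS selector, then read the fields you need (say, faculty name and e-mail address) from the columns of each row. The check in the question never matches because Beautiful Soup 4 returns the class attribute as a list, so tag['class'] == 'MsoTableGrid' is always False. For example: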

    >>> rows = soup.select("table.MsoTableGrid tr")
    >>> for r in rows:
    ...     faculty_info = r.find_all("td")[1:3]
    ...     if len(faculty_info) == 2:
    ...         print faculty_info[0].text.strip(), faculty_info[1].text.strip()
    ... 
    Name E-mail
    Dr. Hassan Ali Dabouq dr.hassandbouk@uoj.ac.ae
    Prof.dr.Magdie   Medhat Elnahry magdielnahry@uoj.ac.ae
    Dr. Abd   Elwahaab Mohamed Khalil abdelwahab@uoj.ac.ae
    Dr.   Ahmed Hassan Fouly Dr.ahmedfoly@uoj.ac.ae
    Dr.   Walid Mohamed Abbas walidabas@uoj.ac.ae
    Dr. Wael   Mahmoud Fakhry wfakhry@uoj.ac.ae
    Dr.   Kamel Abd Elaziz Ali kamelali@uoj.ac.ae
    .
    .
    .
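
The same idea can be tried offline with a minimal sketch. It assumes a small made-up HTML snippet in place of the live page (SAMPLE_HTML and the names and addresses in it are hypothetical), uses the built-in "html.parser" instead of html5lib so nothing extra needs installing, and collapses runs of whitespace inside each cell, which the session above does not do. The helper name faculty_rows is also just an illustration.

```python
from __future__ import print_function  # lets print() work on Python 2.7 as well

from bs4 import BeautifulSoup

# Hypothetical stand-in for the faculty page; names and addresses are made up.
SAMPLE_HTML = """
<table class="MsoTableGrid">
  <tr><td>No.</td><td><span>Name</span></td><td><span>E-mail</span></td></tr>
  <tr><td>1</td><td><span>Dr. Jane   Doe</span></td><td><span>jane@example.com</span></td></tr>
  <tr><td>2</td><td><span>Dr. John Roe</span></td><td><span>john@example.com</span></td></tr>
</table>
<table class="Other">
  <tr><td>x</td><td><span>ignored</span></td><td>y</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Why the check in the question fails: class is a multi-valued attribute,
# so Beautiful Soup 4 returns it as a list, never as a plain string.
table = soup.find("table")
class_value = table["class"]
string_match = class_value == "MsoTableGrid"   # list compared to str
list_match = "MsoTableGrid" in class_value     # membership test on the list


def faculty_rows(soup):
    """Return (name, email) tuples from rows of tables with class MsoTableGrid."""
    rows = []
    for tr in soup.select("table.MsoTableGrid tr"):
        cells = tr.find_all("td")[1:3]  # second and third columns
        if len(cells) == 2:
            # split() + join() collapses repeated spaces and trims the ends
            rows.append(tuple(" ".join(c.get_text().split()) for c in cells))
    return rows


for name, email in faculty_rows(soup):
    print(name, email)
```

If you prefer to keep the loop from the question, replacing the comparison with a membership test ('MsoTableGrid' in tag['class']) and searching tag.find_all('span') rather than the whole soup should also limit the output to that table.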