python-2.7 beautifulsoup scraper

How can I extract text from a span tag using Beautiful Soup 4?


How can I scrape the text inside span tags using Beautiful Soup? I am trying to scrape faculty members' information.

from bs4 import BeautifulSoup
import requests
r = requests.get("http://www.uoj.ac.ae/ContentBan.aspx?m=15&p=4&sm=4")
soup = BeautifulSoup(r.content, 'html5lib')
for tag in soup.find_all('table'):
    if tag.has_attr("class"):
        if tag['class'] == 'MsoTableGrid':
            for tag1 in soup.findAll('span'):
                print tag1.text

I want to print the text inside the span tags, but the only output I get is:

 Process finished with exit code 0

Solution

  • You can select the tr elements of the table with class MsoTableGrid using a CSS selector, and then pull the information you need (say, faculty name and e-mail address) from the columns of each row. For example:

    >>> rows = soup.select("table.MsoTableGrid tr")
    >>> for r in rows:
    ...     faculty_info = r.find_all("td")[1:3]
    ...     if len(faculty_info) == 2:
    ...         print faculty_info[0].text.strip(), faculty_info[1].text.strip()
    ... 
    Name E-mail
    Dr. Hassan Ali Dabouq [email protected]
    Prof.dr.Magdie   Medhat Elnahry [email protected]
    Dr. Abd   Elwahaab Mohamed Khalil [email protected]
    Dr.   Ahmed Hassan Fouly [email protected]
    Dr.   Walid Mohamed Abbas [email protected]
    Dr. Wael   Mahmoud Fakhry [email protected]
    Dr.   Kamel Abd Elaziz Ali [email protected]
    .
    .
    .
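
As for why the original loop printed nothing: Beautiful Soup 4 treats class as a multi-valued attribute, so tag['class'] is a list and comparing it to the string 'MsoTableGrid' is never true. The inner loop also searched the whole document (soup.findAll) rather than the matched table. Here is a minimal sketch of both points. It assumes a small inline HTML sample standing in for the real page (SAMPLE_HTML, span_texts and the names in the table are made up for illustration), and it uses the built-in html.parser so html5lib is not required:

```python
from __future__ import print_function  # same code runs on Python 2.7 and 3

from bs4 import BeautifulSoup

# Hypothetical stand-in for the faculty page; the real markup may differ.
SAMPLE_HTML = """
<table class="MsoTableGrid">
  <tr><td>1</td><td><span>Name</span></td><td><span>E-mail</span></td></tr>
  <tr><td>2</td><td><span>Dr. A. Example</span></td><td><span>a@example.org</span></td></tr>
</table>
<table><tr><td><span>unrelated</span></td></tr></table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table")

# bs4 returns multi-valued attributes such as class as a list of strings.
class_value = table["class"]
matches_string = class_value == "MsoTableGrid"  # list vs. string: never equal
matches_member = "MsoTableGrid" in class_value  # membership test works


def span_texts(doc):
    """Return the stripped text of every span inside table.MsoTableGrid."""
    return [span.get_text(strip=True)
            for span in doc.select("table.MsoTableGrid span")]


texts = span_texts(soup)
print(class_value, matches_string, matches_member)
print(texts)
```

The selector "table.MsoTableGrid span" limits the search to spans inside the matching table, so spans elsewhere on the page are skipped.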