Scrape wikipedia table using BeautifulSoup


I would like to scrape the table titled "List of chemical elements" from the Wikipedia page below and display it using pandas:

https://en.wikipedia.org/wiki/List_of_chemical_elements

I am new to BeautifulSoup, and this is what I have so far:

from bs4 import BeautifulSoup
import requests as r
import pandas as pd

response = r.get('https://en.wikipedia.org/wiki/List_of_chemical_elements')

wiki_text = response.text

soup = BeautifulSoup(wiki_text, 'html.parser')

table_soup = soup.find_all('table')

Solution

  • You can select the table with BeautifulSoup in several ways:

    1. By its "title", i.e. the text it contains:

      soup.select_one('table:-soup-contains("List of chemical elements")')
      
    2. By order in tree (it is the first one):

      soup.select_one('table')
      soup.select('table')[0]
      
    3. By its class (there is no id in your case):

      soup.select_one('table.wikitable')
      

    Or simply with pandas, whose read_html returns a list of DataFrames, one per table found:

    pd.read_html('https://en.wikipedia.org/wiki/List_of_chemical_elements')[0]
    

    *To get the expected result, try it yourself first; if you run into difficulties, ask a new question.
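
As a rough sketch of how a table selected with one of the BeautifulSoup selectors above might be turned into a DataFrame, the snippet below works on a small inline HTML sample rather than the live page. `SAMPLE_HTML` and `table_to_dataframe` are hypothetical names introduced here for illustration, and the helper assumes a single header row, which may not hold for the real Wikipedia table.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Small stand-in for the Wikipedia page (assumption: the real table is much
# larger and may use multi-row headers, which this sketch does not handle).
SAMPLE_HTML = """
<table class="wikitable">
  <caption>List of chemical elements</caption>
  <tr><th>Atomic number</th><th>Symbol</th><th>Name</th></tr>
  <tr><td>1</td><td>H</td><td>Hydrogen</td></tr>
  <tr><td>2</td><td>He</td><td>Helium</td></tr>
  <tr><td>3</td><td>Li</td><td>Lithium</td></tr>
</table>
"""


def table_to_dataframe(table):
    """Convert a simple <table> Tag (one header row) into a DataFrame."""
    rows = table.find_all('tr')
    # The first row holds the column names.
    header = [cell.get_text(strip=True) for cell in rows[0].find_all('th')]
    # Every remaining row becomes one record of plain-text cell values.
    body = [
        [cell.get_text(strip=True) for cell in row.find_all(['td', 'th'])]
        for row in rows[1:]
    ]
    return pd.DataFrame(body, columns=header)


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
# Select the table by the text it contains, as in option 1 above.
table = soup.select_one('table:-soup-contains("List of chemical elements")')
df = table_to_dataframe(table)
print(df)
```

All cell values come back as strings here, so numeric columns would still need converting (for example with `pd.to_numeric`) before any calculation.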