python pandas web-scraping beautifulsoup wikipedia

Scrape link text from a div on Wikipedia into a list or DataFrame with BeautifulSoup using "a" tags


I'm a beginner at coding. I'm trying to scrape the text of the song links inside a div on a Wikipedia category page, using the "a" tags. However, I only get the first song for each letter of the alphabet. I'm reading the link text rather than the title attribute because some links are missing their title in the HTML. If anyone can help, thanks!

import requests
from bs4 import BeautifulSoup
import pandas as pd

url = 'https://en.wikipedia.org/wiki/Category:Song_recordings_produced_by_John_Lennon'

data = requests.get(url)
soup = BeautifulSoup(data.content, "html.parser")
div = soup.find("div", {"class": "mw-category mw-category-columns"})


songs = []

for song in div:
    songs.append(song.find_next("a").text.strip())

print(songs)

Output:

['Air Talk', "Baby's Heartbeat", 'Cambridge 1969', 'Dear John (John Lennon song)', 
'Every Man Has a Woman Who Loves Him', 'F Is Not a Dirty Word', 'Gimme Some Truth', 
'Happy Xmas (War Is Over)', "I Don't Wanna Be a Soldier", 'Jamrag (song)', 
'Kiss Kiss Kiss (Yoko Ono song)', 'Listen, the Snow Is Falling', 'Many Rivers to Cross', 
'New York City (John Lennon and Yoko Ono song)', "O'Wind (Body Is the Scar of Your Mind)", 
'Paper Shoes', 'Radio Play (song)', 'Scared (John Lennon song)', 'Telephone Piece', 
'Waiting for the Sunrise (song)', 'Yang Yang (song)']

Solution

  • You can use this example to collect all 182 songs into a list:

    import requests
    from bs4 import BeautifulSoup
    
    url = "https://en.wikipedia.org/wiki/Category:Song_recordings_produced_by_John_Lennon"
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    
    
    songs = [a.text for a in soup.select("#mw-pages li a")]
    
    print(*songs, sep="\n")
    print()
    print(f"Songs total={len(songs)}")
    

    Prints:

    
    ...
    
    Yellow Girl (Stand by for Life)
    Yes, I'm Your Angel
    You (Yoko Ono song)
    You Are Here (song)
    You're the One (Yoko Ono song)
    
    Songs total=182
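
To see why the loop in the question returns only one song per letter, here is a minimal offline sketch. It assumes the category page groups its links roughly like the hypothetical `SAMPLE_HTML` below (one `mw-category-group` div per letter); the "placeholder" titles are made up for illustration and the real page markup may differ. Iterating over a tag walks only its direct children, and `find_next("a")` returns just the first link after each child, whereas `select` or `find_all` returns every matching link.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down stand-in for the category page markup.
# The "placeholder" titles are invented for this sketch.
SAMPLE_HTML = (
    '<div id="mw-pages">'
    '<div class="mw-category mw-category-columns">'
    '<div class="mw-category-group"><h3>A</h3><ul>'
    '<li><a href="/wiki/Air_Talk">Air Talk</a></li>'
    '<li><a href="/wiki/A2">Another A Song (placeholder)</a></li>'
    '</ul></div>'
    '<div class="mw-category-group"><h3>C</h3><ul>'
    '<li><a href="/wiki/Cambridge_1969">Cambridge 1969</a></li>'
    '<li><a href="/wiki/C2">Another C Song (placeholder)</a></li>'
    '</ul></div>'
    '</div></div>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
div = soup.find("div", {"class": "mw-category mw-category-columns"})

# The question's approach: one iteration per direct child (one per letter
# group), and find_next("a") yields only the first link after that child.
first_only = [child.find_next("a").text.strip() for child in div]

# The answer's approach: a CSS selector that matches every link in the list.
all_songs = [a.text for a in soup.select("#mw-pages li a")]

# An equivalent without CSS selectors: search all descendants of the div.
via_find_all = [a.text.strip() for a in div.find_all("a")]

print(first_only)     # ['Air Talk', 'Cambridge 1969']
print(len(all_songs)) # 4
```

On the live page, the same `select` call is what produces the full list shown above; `div.find_all("a")` is a reasonable alternative if you already hold a reference to the div.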