Tags: python, selenium-webdriver, selenium-edgedriver

What do I do if my code prints only one dictionary when I have several results after scraping a website with Selenium?


This is my code:

from selenium import webdriver
from selenium.webdriver.common.by import By

url ='https://open.spotify.com/playlist/37i9dQZF1DXbTop77dnX35'
driver = webdriver.Edge()
driver.get(url)

songs = []

musics = driver.find_elements(By.CLASS_NAME, "IjYxRc5luMiDPhKhZVUH.UpiE7J6vPrJIa59qxts4")

for music in musics:
    title = music.find_element(By.CLASS_NAME, "btE2c3IKaOXZ4VNAb8WQ").text
    album = music.find_element(By.CLASS_NAME, "_TH6YAXEzJtzSxhkGSqu").text
    artist = music.find_element(By.CLASS_NAME, "Text__TextElement-sc-if376j-0.gYdBJW.encore-text-body-small").text
    duration = music.find_element(By.CLASS_NAME, "PAqIqZXvse_3h6sDVxU0").text
    print(title, album, artist, duration)

new_music = {
    'title' : title,
    'album' : album,
    'artist' : artist,
    'duration' : duration
}

songs.append(new_music)
print(songs)

And this is my result:

{'title': 'Let It Flow', 'album': 'Let It Flow', 'artist': 'Maya Amolo', 'duration': '2:58'}

I want to create a dictionary for every result I got, but it's printing just the last one.


Solution

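  • Indentation matters in Python: it determines which statements belong to a block, such as the body of a for loop. In your code, new_music = { ... } and songs.append(new_music) sit outside the loop, so they run only once, after the loop has finished, using the values from the last iteration. Indent everything from new_music = { down to songs.append(new_music) so that it runs for every song. print(songs) should stay unindented so that it runs once at the end.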

    Also, you are not using By.CLASS_NAME correctly. It expects a single class name, but in some cases you are passing it several class names joined by dots, which is a partial CSS selector. It works, but purely by accident: the Selenium 4 Python bindings turn a By.CLASS_NAME lookup into a CSS selector behind the scenes by prefixing the value with a dot, so your partial selectors happen to form valid ones. Change those calls to use By.CSS_SELECTOR with a leading dot instead.

    The fixed code is below

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    url = 'https://open.spotify.com/playlist/37i9dQZF1DXbTop77dnX35'
    driver = webdriver.Chrome()  # webdriver.Edge(), as in the question, should work the same way
    driver.get(url)

    # Keep the results list separate from the list of scraped row elements;
    # appending to the list being iterated would feed the dicts back into the loop.
    songs = []

    musics = driver.find_elements(By.CSS_SELECTOR, ".IjYxRc5luMiDPhKhZVUH.UpiE7J6vPrJIa59qxts4")
    for music in musics:
        title = music.find_element(By.CLASS_NAME, "btE2c3IKaOXZ4VNAb8WQ").text
        album = music.find_element(By.CLASS_NAME, "_TH6YAXEzJtzSxhkGSqu").text
        artist = music.find_element(By.CSS_SELECTOR, ".Text__TextElement-sc-if376j-0.gYdBJW.encore-text-body-small").text
        duration = music.find_element(By.CLASS_NAME, "PAqIqZXvse_3h6sDVxU0").text
        print(title, album, artist, duration)

        # Build one dictionary per row, inside the loop
        new_music = {
            'title': title,
            'album': album,
            'artist': artist,
            'duration': duration
        }

        songs.append(new_music)

    # Runs once, after the loop has finished
    print(songs)
    

    and it prints

    [{'title': 'Adenuga (feat. Qing Madi)', 'album': 'Adenuga x Concerning', 'artist': 'Joeboy, Qing Madi', 'duration': '2:40'}, {'title': 'Wahala (feat. Olamide)', 'album': 'Wahala (feat. Olamide)', 'artist': 'CKay, Olamide', 'duration': '2:50'}, {'title': 'Fortnight (feat. Post Malone)', 'album': 'THE TORTURED POETS DEPARTMENT', 'artist': 'Taylor Swift, Post Malone', 'duration': '3:48'}, {'title': 'IN MY HEAD', 'album': 'IN MY HEAD', 'artist': 'Timaya, Tiwa Savage', 'duration': '2:57'}, ...
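
The indentation point can be reproduced without a browser. The sketch below replaces the scraped elements with a hypothetical list of (title, duration) tuples, using sample values taken from the output above. The function names collect_outside_loop and collect_inside_loop are illustrative only and do not appear in the original code.

```python
# Hypothetical stand-in for the scraped rows (values taken from the output shown above).
rows = [
    ("Adenuga (feat. Qing Madi)", "2:40"),
    ("Wahala (feat. Olamide)", "2:50"),
    ("IN MY HEAD", "2:57"),
]


def collect_outside_loop(rows):
    """Mirror the question: the dict is built after the loop has finished."""
    songs = []
    for title, duration in rows:
        pass  # the loop only rebinds title and duration on each pass
    # Runs once, so only the values from the final iteration are stored.
    songs.append({"title": title, "duration": duration})
    return songs


def collect_inside_loop(rows):
    """Mirror the fix: the dict is built and appended on every iteration."""
    songs = []
    for title, duration in rows:
        songs.append({"title": title, "duration": duration})
    return songs


print(len(collect_outside_loop(rows)))  # 1
print(len(collect_inside_loop(rows)))   # 3
```

The only difference between the two functions is the indentation of the append call, which is the same change the answer asks for.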
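
The By.CLASS_NAME point can be sketched as a small helper that turns several class names, separated by spaces or dots, into a compound CSS selector for use with By.CSS_SELECTOR. classes_to_css is a hypothetical helper, not part of Selenium, and it assumes plain class names that need no CSS escaping.

```python
import re


def classes_to_css(class_names: str) -> str:
    """Build a compound CSS selector from class names separated by spaces or dots.

    Hypothetical helper; assumes plain class names that need no CSS escaping.
    """
    parts = [p for p in re.split(r"[\s.]+", class_names) if p]
    return "".join("." + p for p in parts)


# A class attribute as it might be copied from the browser inspector (space-separated).
print(classes_to_css("IjYxRc5luMiDPhKhZVUH UpiE7J6vPrJIa59qxts4"))
# .IjYxRc5luMiDPhKhZVUH.UpiE7J6vPrJIa59qxts4
```

The returned string can be passed as the second argument to find_element(By.CSS_SELECTOR, ...), while a single class name can still go to By.CLASS_NAME unchanged.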