The idea is to collect the IDs (not the display names) of all SoundCloud users who posted tracks whose title starts with a given letter, e.g. "f", within a given period, in our case the past year.
I applied the filters on SoundCloud and got the results at the following URL:
I found the first user's ID ("wavey-hefner") in the following line of HTML:
<a class="sound__coverArt" href="/wavey-hefner/foreign" draggable="true">
I want to get every user's ID from the whole HTML page.
My code is:
import requests
import re
from bs4 import BeautifulSoup
html = requests.get(" filter.created_at=last_year&filter.genre_or_tag=hip-hop%20%26%20rap")
soup = BeautifulSoup(html.text, 'html.parser')
for id in soup.findAll("a", {"class" : "sound__coverArt"}):
    print(id.get('href'))
It returns nothing :(
The page is rendered with JavaScript, so the HTML that requests downloads does not contain those links. You can use Selenium to render it. First install Selenium:
pip3 install selenium
Then get a driver for your browser and put it on your PATH. If you are on Windows or Mac, you can use a headless version of Chrome (Canary) if you like.
from bs4 import BeautifulSoup
from selenium import webdriver
import time
browser = webdriver.Chrome()
url = (' filter.created_at=last_year&filter.genre_or_tag=hip-hop%20%26%20rap')
# Load the page and give the JavaScript a moment to render (the 2 seconds are arbitrary)
browser.get(url)
time.sleep(2)
# To make it load more scroll to the bottom of the page (repeat if you want to)
browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
time.sleep(2)
html_source = browser.page_source
browser.quit()
soup = BeautifulSoup(html_source, 'html.parser')
for id in soup.findAll("a", {"class" : "sound__coverArt"}):
    print(id.get('href'))
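The loop above prints hrefs such as /wavey-hefner/foreign, while the question asks for user IDs. Here is a minimal sketch of that last step, assuming every track link has the form /user-id/track-slug as in the question's example. The function name `user_ids_from_hrefs`, the `first_letter` parameter, and the second sample href are hypothetical, and filtering on the slug only approximates filtering on the track title.

```python
from urllib.parse import urlparse


def user_ids_from_hrefs(hrefs, first_letter=None):
    """Return unique user IDs, in first-seen order, from hrefs like '/user-id/track-slug'."""
    user_ids = []
    for href in hrefs:
        if not href:
            continue  # .get('href') returns None when an anchor has no href
        parts = [p for p in urlparse(href).path.split("/") if p]
        if len(parts) != 2:
            continue  # skip anything that is not /user-id/track-slug
        user_id, track_slug = parts
        # Assumption: the slug starts with the same letter as the track title
        if first_letter and not track_slug.lower().startswith(first_letter.lower()):
            continue
        if user_id not in user_ids:
            user_ids.append(user_id)
    return user_ids


# The first href is from the question; the second one is made up for illustration
hrefs = ["/wavey-hefner/foreign", "/wavey-hefner/foreign", "/some-user/another-track", None]
print(user_ids_from_hrefs(hrefs))                    # ['wavey-hefner', 'some-user']
print(user_ids_from_hrefs(hrefs, first_letter="f"))  # ['wavey-hefner']
```

To use it with the Selenium code, collect the hrefs into a list instead of printing them and pass that list to the function.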