I'm working on a web scraping project and I'm attempting to extract a list of store URLs from the following page: https://maroof.sa/businesses. Here are the methods I've tried thus far, but without success:
Using BeautifulSoup and Requests in Python to parse the HTML, but I was unable to locate the correct tags/classes that contain the store URLs.
Employing Selenium to wait for JavaScript to render the store links and then extract them, but the appropriate elements
The store list appears to be loaded by JavaScript from a JSON API, so you can try requesting that API directly:
import json
import requests
url = "https://api.thiqah.sa/maroof/public/api/app/business/search?keyword=&businessTypeId=&businessTypeSubCategoryId=&regionId=&cityId=&certificationType=&sortBy=2&sortDirection=2&sorting=&skipCount=0&maxResultCount=10"
headers = {"apikey": "c1qesecmag8GSbxTHGRjfnMFBzAH7UAN"}
data = requests.get(url, headers=headers).json()
# print(json.dumps(data, indent=4))
for i in data["items"]:
    print(i["nameAr"])
    print(f"https://maroof.sa/businesses/details/{i['id']}")
    print()
Prints:
متجر بياض
https://maroof.sa/businesses/details/229217
متجر مروه
https://maroof.sa/businesses/details/48066
المتسوقة مآثر
https://maroof.sa/businesses/details/168551
متجر أحمد للأنظمة الصوتية والألكترونيات
https://maroof.sa/businesses/details/25650
متعب للأرقام المميزة
https://maroof.sa/businesses/details/253838
متجر شوب تتش
https://maroof.sa/businesses/details/246531
مؤسسة ماجد عبدالله الشهراني للمقاولات
https://maroof.sa/businesses/details/244174
شركة إنجاز للخدمات
https://maroof.sa/businesses/details/276939
موقع عاملتي الرقمي
https://maroof.sa/businesses/details/261892
مندوبكم
https://maroof.sa/businesses/details/112807
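If you need more than the first 10 stores, here is a minimal sketch of paging through the results. It assumes that `skipCount` behaves as an offset and `maxResultCount` as a page size, which the parameter names suggest but which I have not confirmed. The helper names `fetch_page` and `iter_store_urls` and the `max_pages` cap are my own, not part of the API.

```python
from typing import Callable, Iterator

API_URL = "https://api.thiqah.sa/maroof/public/api/app/business/search"
HEADERS = {"apikey": "c1qesecmag8GSbxTHGRjfnMFBzAH7UAN"}


def fetch_page(skip: int, limit: int) -> dict:
    """Fetch one page of search results using the same query parameters as above."""
    # Imported here so iter_store_urls can also be used with a different fetcher.
    import requests

    params = {
        "keyword": "",
        "businessTypeId": "",
        "businessTypeSubCategoryId": "",
        "regionId": "",
        "cityId": "",
        "certificationType": "",
        "sortBy": 2,
        "sortDirection": 2,
        "sorting": "",
        "skipCount": skip,        # assumption: number of results to skip
        "maxResultCount": limit,  # assumption: number of results per page
    }
    response = requests.get(API_URL, headers=HEADERS, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


def iter_store_urls(
    fetch: Callable[[int, int], dict] = fetch_page,
    page_size: int = 10,
    max_pages: int = 5,
) -> Iterator[str]:
    """Yield detail-page URLs page by page, stopping at a short or empty page."""
    for page in range(max_pages):
        items = fetch(page * page_size, page_size).get("items", [])
        for item in items:
            yield f"https://maroof.sa/businesses/details/{item['id']}"
        # A page with fewer items than requested is treated as the last one.
        if len(items) < page_size:
            break
```

Something like `for store_url in iter_store_urls(max_pages=3): print(store_url)` would then print up to 30 URLs. The `max_pages` cap keeps the loop from hammering the server if the stop condition never triggers; raise it once you have checked how the API responds past the last page.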