I'm using Python to scrape restaurant names from Foodpanda. The page's items are all rendered through its <script> tags, so I can't get any data from the HTML/CSS.
import re
from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

foodpanda_url = "https://www.foodpanda.hk/restaurants/new?lat=22.33523782&lng=114.18249102&expedition=pickup&vertical=restaurants"
# send a request to the page with a Mozilla/5.0 User-Agent header
req = Request(foodpanda_url, headers={'User-Agent' : 'Mozilla/5.0'})
# open the page with urlopen
page = urlopen(req)
soup = BeautifulSoup(page.read(), "html.parser")
str_soup = str(soup.prettify())
I parse out the vendors string from str_soup using the following:
fp_vendors = list()
vendorlst = str_soup.split("\"discoMeta\":{\"reco_config\":{\"flags\":[]},\"traces\":[]},\"items\":")
opensqr = 0
startobj = 0
for i in range(len(vendorlst)):
    if i==0:
        continue  # text before the first "items" marker holds no vendors
    for cnt in range(len(vendorlst[i])):
        if (vendorlst[i][cnt] == '['):
            opensqr += 1
        elif (vendorlst[i][cnt] == ']'):
            opensqr -= 1
        if opensqr == 0:
            # matching ']' found: everything between the brackets is the vendor array
            vendorsStr = vendorlst[i][1:cnt]
            opencurly = 0
            for x in range(len(vendorsStr)):
                if vendorsStr[x] == ',':
                    continue
                if (vendorsStr[x] == '{'):
                    opencurly += 1
                elif (vendorsStr[x] == '}'):
                    opencurly -= 1
                if opencurly == 0:
                    # a top-level {...} object just closed
                    vendor = vendorsStr[startobj:x+1]
                    if (vendor not in fp_vendors) and vendor != "":
                        fp_vendors.append(vendor)
                        startobj = x+2 #continue to next {
            break  # stop scanning once the array has been processed
for item in fp_vendors:
    # print(item+"\n")
    itemstr = re.split("\"minimum_pickup_time\":[0-9]+,\"name\":\"", item)[1]
    itemstr = itemstr.split("\",")[0]
However, this only returns about 50 restaurants. How can I get more restaurant items from Foodpanda? How do I simulate scrolling down the page so that more items are loaded and I can collect them?
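Using your browser's dev tools, you can easily monitor all the requests the page makes. For your particular case, I found an API call to disco.deliveryhero.io that returns the vendors as JSON; it takes limit and offset parameters, so you can page through the results instead of simulating scrolling.
Here is a complete solution to your problem: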
import json
import requests
items_list = []
url = "https://disco.deliveryhero.io/listing/api/v1/pandora/vendors?latitude=22.33523782&longitude=114.18249102&language_id=1&include=characteristics&dynamic_pricing=0&configuration=Variant1&country=hk&customer_id=&customer_hash=&budgets=&cuisine=&sort=&food_characteristic=&use_free_delivery_label=false&opening_type=pickup&vertical=restaurants&limit=48&offset={}&customer_type=regular"
for i in range(5):
    # each page holds 48 items, so page i starts at offset i * 48
    resp = requests.get(
        url.format(i * 48),
        headers={
            "x-disco-client-id": "web",
        },
    )
    if resp.status_code == 200:
        items_list += json.loads(resp.text)["data"]["items"]
        print(f"Finished page: {i}")
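If you don't know in advance how many pages there are, one option is to keep requesting pages until an empty one comes back, and then pull out the names. The following is only a sketch under some assumptions: `fetch_page` and `collect_names` are hypothetical helper names, it uses the standard library's `urllib` rather than `requests`, it assumes each item carries a `"name"` key (as the `"name":"` field in your regex suggests), and it assumes the API returns an empty `items` list once the offset runs past the last vendor.

```python
import json
from typing import Callable, List
from urllib.request import Request, urlopen

PAGE_SIZE = 48  # same value as the "limit" parameter in the URL above


def fetch_page(url_template: str, offset: int) -> List[dict]:
    """Fetch one page of vendor items (hypothetical helper).

    url_template must contain a "{}" placeholder for the offset,
    like the url string in the solution above.
    """
    req = Request(url_template.format(offset),
                  headers={"x-disco-client-id": "web"})
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)["data"]["items"]


def collect_names(get_items: Callable[[int], List[dict]],
                  page_size: int = PAGE_SIZE,
                  max_pages: int = 50) -> List[str]:
    """Request pages until one is empty, then return unique names in order."""
    names: List[str] = []
    for page in range(max_pages):  # max_pages is a safety cap, not an API limit
        items = get_items(page * page_size)
        if not items:
            break  # assumed signal that there are no more vendors
        names.extend(item["name"] for item in items if "name" in item)
    # dict.fromkeys drops duplicates while keeping first-seen order
    return list(dict.fromkeys(names))


# Example usage (performs real network requests):
# names = collect_names(lambda offset: fetch_page(url, offset))
```

Passing the fetcher in as a function keeps the paging logic separate from the HTTP call, so you can swap in `requests` or test the loop without touching the network.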