I am trying to understand how I can return several dictionaries from a function. If I print data_dict inside the function itself, I get five dictionaries. If I return data_dict from the function, store it in a variable and then print it, only the last dictionary is shown. How can all five dictionaries be returned?
import requests
from bs4 import BeautifulSoup
import re
import json
source = requests.get('https://www.tripadvisor.ch/Hotel_Review-g188113-d228146-Reviews-Coronado_Hotel-Zurich.html#REVIEWS').text
soup = BeautifulSoup(source, 'lxml')
pattern = re.compile(r'window.__WEB_CONTEXT__={pageManifest:(\{.*\})};')
script = soup.find("script", text=pattern)
dictData = pattern.search(script.text).group(1)
jsonData = json.loads(dictData)
def get_reviews():
    data_dict = {}
    for locations in jsonData['urqlCache']['669061039']['data']['locations']:
        for data in locations['reviewListPage']['reviews']:
            data_dict['reviewid'] = data['id']
            data_dict['authoridtripadvisor'] = data['userId']
            userProfile = data['userProfile']
            data_dict['author'] = userProfile['displayName']
            print(data_dict)
    #return data_dict
reviews = get_reviews()
print(reviews)
Thank you for all suggestions!
Your problem is that data_dict can hold only one dictionary at a time. You have to create a list for all the dictionaries, all_dictionaries = [], append() every dictionary to this list with all_dictionaries.append(data_dict), and then return this list with return all_dictionaries.
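Inside the for-loop you also have to create a new dictionary for each review. You can't reuse one data_dict and replace its elements, because every entry in the list would then refer to the same dictionary and show the values of the last review.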
def get_reviews():
    all_dictionaries = []
    for locations in jsonData['urqlCache']['669061039']['data']['locations']:
        for data in locations['reviewListPage']['reviews']:
            data_dict = {}
            data_dict['reviewid'] = data['id']
            data_dict['authoridtripadvisor'] = data['userId']
            userProfile = data['userProfile']
            data_dict['author'] = userProfile['displayName']
            print(data_dict)
            all_dictionaries.append(data_dict)
    return all_dictionaries
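
To see why the new dictionary has to be created inside the loop, here is a small sketch that runs without the network request. The sample_reviews list and the function names collect_shared and collect_fresh are made up for illustration and only imitate the shape of the review data; they are not part of the original code.

```python
# Made-up sample data standing in for the parsed reviews
# (an assumption for illustration, not real TripAdvisor output).
sample_reviews = [
    {'id': 1, 'userId': 'u1'},
    {'id': 2, 'userId': 'u2'},
    {'id': 3, 'userId': 'u3'},
]

def collect_shared(reviews):
    """Problematic variant: one dictionary is created once and reused."""
    shared = {}
    result = []
    for data in reviews:
        shared['reviewid'] = data['id']  # overwrites the previous value
        result.append(shared)            # appends the same object again
    return result

def collect_fresh(reviews):
    """Working variant: a new dictionary is created on every iteration."""
    result = []
    for data in reviews:
        fresh = {'reviewid': data['id']}
        result.append(fresh)
    return result

shared_result = collect_shared(sample_reviews)
fresh_result = collect_fresh(sample_reviews)

print(shared_result)  # [{'reviewid': 3}, {'reviewid': 3}, {'reviewid': 3}]
print(fresh_result)   # [{'reviewid': 1}, {'reviewid': 2}, {'reviewid': 3}]
```

In the first variant the list holds three references to one dictionary, so every entry shows the last review. In the second variant each entry is its own dictionary, which is what the corrected get_reviews() does.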