When I convert my lists to a dictionary, only one element is stored in the dictionary, and I don't know why. This is my code. I used BeautifulSoup for web scraping and stored the results in lists. Finally, I want to clean them and store them in a JSON file.
import json

import requests
from bs4 import BeautifulSoup

data = []
qa_dict={}
q_text=[]
q_username=[]
q_time=[]
r_text=[]
for i in range(1,2):
    print(i)
    url='https://drakhosravi.com/faq/?cat=all&hpage='+str(i)
    response1 = requests.get(url).content.decode()
    soup = BeautifulSoup(response1,'html.parser')
    for s in soup.select('span.hamyar-comment-person-name') :
        if(s.text!='دکتر آرزو خسروی'):
            q_username.append(s.text.strip())
    for times in soup.select('div.comment-header'):
        for s in times.select('span.hamyar-comment-date') :
            q_time.append(s.text.strip())
    question=[]
    for comment in soup.select('div.comment-body'):
        question.append(comment.text.strip())
    q_text=question[0::2]
    for r in soup.select('ol.faq-comment_replies'):
        for head in r.select('li'):
            for t in head.select('div.comment-body'):
                r_text.append(t.text.strip())
for username,qtime,qtxt,rtxt in zip(q_username,q_time,q_text,r_text):
    qa_dict= {'username':username,'question_time':qtime,'question_text':qtxt,'url':url,'respond_text':rtxt,'responder_profile_url':'https://drakhosravi.com/about-us'}
data.append(qa_dict)
with open('drakhosravi2.json', 'w+', encoding="utf-8") as handle:
    json.dump(qa_dict, handle, indent=4, ensure_ascii=False)
I don't fully understand your question, but you seem to want qa_dict to have lists as values. Only one element is stored because qa_dict is reassigned on every iteration of the loop, so each new dictionary replaces the previous one and only the last set of values survives. Change this
for username,qtime,qtxt,rtxt in zip(q_username,q_time,q_text,r_text):
    qa_dict={'username':username,'question_time':qtime,'question_text':qtxt,'url':url,'respond_text':rtxt,'responder_profile_url':'https://drakhosravi.com/about-us'}
to
qa_dict={'username':q_username,'question_time':q_time,'question_text':q_text,'url':url,'respond_text':r_text,'responder_profile_url':'https://drakhosravi.com/about-us'}
In other words, there is no need for the loop. Since q_username, q_time, q_text and r_text are each a list, qa_dict will now have lists as the values for those keys (the two URL entries stay plain strings).
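As a rough sketch of this dictionary-of-lists variant: the sample values below are made up and simply stand in for the scraped lists, and the result is serialized to a string with json.dumps rather than written to a file, so the snippet runs without network access.

```python
import json

# Hypothetical sample data standing in for the scraped lists (not real page content).
q_username = ["user1", "user2"]
q_time = ["2020-01-01", "2020-01-02"]
q_text = ["First question?", "Second question?"]
r_text = ["First answer.", "Second answer."]
url = "https://drakhosravi.com/faq/?cat=all&hpage=1"

# One dictionary whose values are the whole lists; no loop is needed.
qa_dict = {
    "username": q_username,
    "question_time": q_time,
    "question_text": q_text,
    "url": url,
    "respond_text": r_text,
    "responder_profile_url": "https://drakhosravi.com/about-us",
}

# ensure_ascii=False keeps non-ASCII text (such as Persian) readable in the output.
qa_json = json.dumps(qa_dict, indent=4, ensure_ascii=False)
print(qa_json)
```

With this shape, the i-th question, time, and reply are related only by their position in the separate lists.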
Or, if you want to create a list of dictionaries, move data.append(qa_dict) inside the for-loop, and pass data instead of qa_dict to json.dump so that every entry is written to the file.
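A minimal sketch of the list-of-dictionaries variant might look like the following. The sample values are made up and stand in for the scraped lists, and json.dumps is used in place of a file write so the snippet is self-contained.

```python
import json

# Hypothetical sample data standing in for the scraped lists (not real page content).
q_username = ["user1", "user2"]
q_time = ["2020-01-01", "2020-01-02"]
q_text = ["First question?", "Second question?"]
r_text = ["First answer.", "Second answer."]
url = "https://drakhosravi.com/faq/?cat=all&hpage=1"

data = []
for username, qtime, qtxt, rtxt in zip(q_username, q_time, q_text, r_text):
    qa_dict = {
        "username": username,
        "question_time": qtime,
        "question_text": qtxt,
        "url": url,
        "respond_text": rtxt,
        "responder_profile_url": "https://drakhosravi.com/about-us",
    }
    # Appending inside the loop keeps every dictionary, not just the last one.
    data.append(qa_dict)

# Serialize the whole list, not the last qa_dict.
data_json = json.dumps(data, indent=4, ensure_ascii=False)
print(data_json)
```

To write a file instead, json.dump(data, handle, indent=4, ensure_ascii=False) inside the existing with-block should work the same way. Note that zip stops at the shortest input, so if the scraped lists have different lengths, the extra items are silently dropped.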