Tags: python, json, beautifulsoup, zip

Convert lists to a dictionary using zip


When I convert my lists to a dictionary, only one element ends up stored in it, and I don't know why. My code is below. I scrape the page with BeautifulSoup and collect the results in lists; finally, I want to clean them up and save them to a JSON file.

import json

import requests
from bs4 import BeautifulSoup

data = []
qa_dict = {}

q_text = []
q_username = []
q_time = []
r_text = []

for i in range(1,2):
    print(i)

    url='https://drakhosravi.com/faq/?cat=all&hpage='+str(i)
    response1 = requests.get(url).content.decode()

    soup = BeautifulSoup(response1,'html.parser')

    for s in soup.select('span.hamyar-comment-person-name'):
        # Skip the responder's name (Dr. Arezoo Khosravi)
        if s.text != 'دکتر آرزو خسروی':
            q_username.append(s.text.strip())

    for times in soup.select('div.comment-header'):
        for s in times.select('span.hamyar-comment-date'):
            q_time.append(s.text.strip())

    question=[]
    for comment in soup.select('div.comment-body'):
        question.append(comment.text.strip())

    q_text=question[0::2]

    for r in soup.select('ol.faq-comment_replies'):
        for head in r.select('li'):
            for t in head.select('div.comment-body'):
                r_text.append(t.text.strip())

    for username,qtime,qtxt,rtxt in zip(q_username,q_time,q_text,r_text):

        qa_dict= {'username':username,'question_time':qtime,'question_text':qtxt,'url':url,'respond_text':rtxt,'responder_profile_url':'https://drakhosravi.com/about-us'}

    data.append(qa_dict)

with open('drakhosravi2.json', 'w+', encoding="utf-8") as handle:
    json.dump(qa_dict, handle, indent=4, ensure_ascii=False)

Solution

  • I'm not entirely sure what output you want, but only one element survives because qa_dict is reassigned on every iteration of the loop, so each pass discards the values from the previous one. If you want qa_dict to hold lists as values, change this

    for username,qtime,qtxt,rtxt in zip(q_username,q_time,q_text,r_text):
      qa_dict={'username':username,'question_time':qtime,'question_text':qtxt,'url':url,'respond_text':rtxt,'responder_profile_url':'https://drakhosravi.com/about-us'}
    

    to

    qa_dict={'username':q_username,'question_time':q_time,'question_text':q_text,'url':url,'respond_text':r_text,'responder_profile_url':'https://drakhosravi.com/about-us'}
    

    In other words, no loop is needed. Since each of these is a list, qa_dict will now have lists as its values.

    Or, if you want a list of dictionaries, move data.append(qa_dict) inside the for loop, and pass data rather than qa_dict to json.dump so that every record is written.
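
The list-of-dictionaries variant can be sketched roughly as follows. This is a minimal illustration, not the asker's actual scraper: the four lists are filled with made-up sample values in place of the scraped data, and json.dumps is used instead of writing a file so the example runs without network or disk access.

```python
import json

# Hypothetical sample data standing in for the scraped lists.
q_username = ["alice", "bob"]
q_time = ["2021-01-01", "2021-01-02"]
q_text = ["First question?", "Second question?"]
r_text = ["First answer.", "Second answer."]
url = "https://drakhosravi.com/faq/?cat=all&hpage=1"

data = []
for username, qtime, qtxt, rtxt in zip(q_username, q_time, q_text, r_text):
    qa_dict = {
        "username": username,
        "question_time": qtime,
        "question_text": qtxt,
        "url": url,
        "respond_text": rtxt,
        "responder_profile_url": "https://drakhosravi.com/about-us",
    }
    # Appending inside the loop keeps every record, not only the last one.
    data.append(qa_dict)

# Serialize the whole list rather than the last dictionary.
json_text = json.dumps(data, indent=4, ensure_ascii=False)
print(len(data), "records")
```

Note that zip stops at the shortest input, so if the scraped lists end up with different lengths, the extra items are silently dropped.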