Write a CSV file after scraping data from Reddit


I am new to coding and have not been able to write a CSV file with the data I scraped from Reddit.

First, I scraped data using the Pushshift API, which returned the results in a list format (screenshot omitted).

I want to write that data to a CSV file so I can run a content analysis in R, with each entry (0000, 0001, etc.) as a row. However, I have not been able to write code that puts each parameter in its own column. For instance, I want the columns to be submissions.author, submissions.num_comments, and submissions.title, to name a few.

I ran this piece of code, but the results are not exactly what I'm looking for:

import csv
 with open('my_file.csv', 'w') as f:
    writer = csv.writer(f)
    with open('my_file.csv', 'w') as f:
      for row in lastest_submissions:
        row_text = ','.join(row) + '\n'  
        f.write(row_text)

The outcome looks like this (screenshot omitted).

What I would like is for each parameter name to be a header and each parameter value to be the content of the corresponding cell. For example, for 'author':'bl00d', the header would be author and the cell content would be bl00d (for line 0000).

I appreciate any help and hints. Also, let me know if I should provide the complete code.


Solution

  • Since your data is already a list of dictionaries, you may want to try csv.DictWriter, which writes each dictionary as a row with the keys as column headers.

    Sample code:

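    import csv

    # Sample data: a list of dictionaries, one per row
    lstdc = [{'name': 'Jack', 'age': 26},
             {'name': 'John', 'age': 27},
             {'name': 'Lisa', 'age': 36},
             {'name': 'Adam', 'age': 16}]

    # Use the keys of the first dictionary as the column headers
    fieldNames = list(lstdc[0].keys())

    # newline='' lets the csv module handle line endings itself
    with open('list_of_dict_to_csv.csv', 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldNames)
        writer.writeheader()
        # Write one row per dictionary
        for val in lstdc:
            writer.writerow(val)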
    

    You can replace lstdc with your list of submissions (named lastest_submissions in your snippet) and list_of_dict_to_csv.csv with my_file.csv.

    Alternatively, replace the loop over the list of dictionaries with the built-in writerows():

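    # newline='' lets the csv module handle line endings itself
    with open('list_of_dict_to_csv.csv', 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldNames)
        writer.writeheader()
        # Write all dictionaries in one call
        writer.writerows(lstdc)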
    

    Hope this helps!
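
For the Reddit data specifically, the same approach might look like the sketch below. It rests on assumptions that the question does not confirm: that each submission is a plain dictionary, that it may contain more keys than you want as columns, and that some keys may be missing from some submissions. The sample records are made up for illustration (only the author bl00d comes from the question), and the list is named latest_submissions here. The extrasaction and restval arguments of csv.DictWriter handle the extra and missing keys.

```python
import csv

# Hypothetical sample shaped like the data in the question; real records
# are assumed to carry more keys, and not always the same ones.
latest_submissions = [
    {"author": "bl00d", "num_comments": 3, "title": "First post", "score": 10},
    {"author": "someone", "title": "Second post"},  # num_comments is missing
]

# Only these keys become columns, in this order.
columns = ["author", "num_comments", "title"]

with open("my_file.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(
        f,
        fieldnames=columns,
        extrasaction="ignore",  # silently skip keys not listed in columns
        restval="",             # write an empty cell when a key is missing
    )
    writer.writeheader()
    writer.writerows(latest_submissions)
```

Listing the columns explicitly, rather than taking the keys of the first dictionary, keeps the header stable even if the first record happens to lack a field. Without extrasaction="ignore", DictWriter raises a ValueError when a dictionary contains a key that is not in fieldnames.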