Tags: python, python-2.7, csv, web-crawler, export-to-csv

How to export a CSV file with a header using Python 2.7


I'm trying to figure out how to export the results of my script to a CSV file with Python 2.7. The CSV file should contain two columns:

The first column should contain the URLs, and I would like to give this column a name. The second column should contain the result, either "keyword found" or "keyword NOT found" (the text printed by the first and second print calls in my code). I would like to name the second column as well.

My code:

import urllib2
    
keyword = ['viewport']

with open('top1m-edited.csv') as f:
    for line in f:
        strdomain = line.strip()
        if '.nl' in strdomain:
            try:
                req = urllib2.Request(strdomain.strip())
                response = urllib2.urlopen(req)
                html_content = response.read()

                for searchstring in keyword:
                    if searchstring.lower() in str(html_content).lower():
                        print (strdomain, keyword, 'keyword found')

                    else:
                        print (strdomain, 'keyword NOT found')

            except urllib2.HTTPError:
                print (strdomain,'HTTP ERROR')

            except urllib2.URLError:
                print (strdomain,'URL ERROR')

            except urllib2.socket.error:
                print (strdomain,'SOCKET ERROR')

            except urllib2.ssl.CertificateError:
                print (strdomain,'SSL Certificate ERROR')

f.close()

So what kind of code do I need to add in order to achieve my goal?


Solution

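  • You can use the ','.join() method to turn a list of strings into a single comma-separated string, then write that string to the file as one row. Note that f.write() does not add a line break, so each row needs an explicit '\n' at the end. Plain joining also does not quote values, so it only produces valid CSV when the values themselves contain no commas.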

    with open('my_file.csv', 'w') as f:
        # Write out your column headers, ending the row with a newline
        f.write(','.join(['column1header', 'column2header']) + '\n')
    
        # Inside your existing loop over the domains, replace the inner
        # for loop with this to write to the file instead of stdout
        for searchstring in keyword:
            if searchstring.lower() in str(html_content).lower():
                f.write(','.join([strdomain, 'keyword found']) + '\n')
            else:
                f.write(','.join([strdomain, 'keyword NOT found']) + '\n')
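As an alternative to joining strings by hand, the standard library's csv module can write the header and the rows, and it quotes any value that happens to contain a comma. The following is only a minimal sketch, not part of the original answer: the column names in HEADER, the helper write_results, and the sample rows are all made up for illustration. The in-memory io.StringIO buffer assumes Python 3; on Python 2.7 you would instead pass a real file opened in binary mode, e.g. open('results.csv', 'wb').

```python
import csv
import io

# Hypothetical column names; pick whatever headers suit your data.
HEADER = ['url', 'result']


def write_results(out_file, rows):
    """Write a header row followed by (url, result) rows to an open file."""
    writer = csv.writer(out_file, lineterminator='\n')
    writer.writerow(HEADER)
    for url, result in rows:
        writer.writerow([url, result])


# Made-up rows for illustration; in the real script each row would be
# collected inside the loop where the results are currently printed.
sample_rows = [
    ('http://example.nl', 'keyword found'),
    ('http://example.nl/a,b', 'keyword NOT found'),
]

buf = io.StringIO()  # in-memory stand-in for a real file (Python 3)
write_results(buf, sample_rows)
csv_text = buf.getvalue()
```

In the script from the question, the same writer.writerow([strdomain, 'keyword found']) call could replace each print, including the ones in the except branches, so that errors also end up as rows in the file.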