Tags: python, csv, beautifulsoup, urllib2

Python, Beautifulsoup, CSV Array Output


I've been trying to figure this out for a while, but being new to Python and BeautifulSoup I'm not getting far. Here's the code:

import urllib2
import csv
from bs4 import BeautifulSoup

urls =  ["https://coinmarketcap.com/currencies/bitcoin/historical-data/",
        "https://coinmarketcap.com/currencies/ethereum/historical-data/",
        "https://coinmarketcap.com/currencies/ripple/historical-data",
        "https://coinmarketcap.com/currencies/bitcoin-cash/historical-data",
        "https://coinmarketcap.com/currencies/litecoin/historical-data"]

for url in urls:
 page = urllib2.urlopen(url)
 soup = BeautifulSoup(page, "html.parser")

 row = soup.find("tr", attrs={"class":"text-right"})
 row2 = row.find_all("td")
 print (row2[0].text, row2[1].text, row2[2].text, row2[3].text, row2[4].text, row2[5].text)


Print Output:
    (u'Aug 08, 2018', u'6746.85', u'6746.85', u'6226.22', u'6305.80', u'5,064,430,000')
    (u'Aug 08, 2018', u'379.89', u'380.67', u'353.73', u'356.61', u'2,016,080,000')
    (u'Aug 08, 2018', u'0.380875', u'0.380875', u'0.326996', u'0.331944', u'360,857,000')
    (u'Aug 08, 2018', u'660.05', u'660.05', u'575.64', u'585.45', u'450,595,000')
    (u'Aug 08, 2018', u'68.16', u'68.16', u'62.14', u'62.49', u'313,187,000')

The 'Print Output' above is how I want the CSV output to look. However, when I add the code for the csv writer, I only get the last row of data:

with open("hello world.csv",'wb') as f:
 wr = csv.writer(f)
 wr.writerows([(row2[0].text, row2[1].text, row2[2].text, row2[3].text, row2[4].text, row2[5].text)])


writerows Output:
(u'Aug 08, 2018', u'68.16', u'68.16', u'62.14', u'62.49', u'313,187,000')

Any help in making the CSV output the same as the print output would be greatly appreciated!

Many thanks,

OM


Solution

  • Assuming the CSV-related code is inside the loop, the problem is that you re-create the file on every iteration:

    with open("hello world.csv",'wb') as f:
    

    As explained in the docs, mode w is for:

    … writing (truncating the file if it already exists)

    If you want to append to an existing file instead of truncating the file and starting over, you use mode a.
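To see the difference between the two modes, here is a minimal sketch. It assumes Python 3 (so the file is opened in text mode with newline="" rather than 'wb'), and the two rows and the demo.csv path are made-up stand-ins for the scraped data, not part of the original script.

```python
import csv
import os
import tempfile

# Hypothetical sample rows standing in for the scraped data.
rows = [
    ("Aug 08, 2018", "6746.85"),
    ("Aug 08, 2018", "379.89"),
]

# Hypothetical scratch file so the demo does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# Mode "w": every open() truncates the file, so only the last row survives.
for row in rows:
    with open(path, "w", newline="") as f:
        csv.writer(f).writerow(row)
with open(path, newline="") as f:
    truncated = list(csv.reader(f))

os.remove(path)

# Mode "a": every open() appends, so all rows are kept.
for row in rows:
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(row)
with open(path, newline="") as f:
    appended = list(csv.reader(f))

print(truncated)  # one row: the last one written
print(appended)   # both rows
```

Note that with mode a, rows from earlier runs of the script also stay in the file, which may or may not be what you want.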

    However, a much simpler solution is to just open the file once. Move the with open and the wr = csv.writer lines outside the loop. Then, each time through the loop, just write more rows into the existing wr.


    If that CSV code isn't inside the loop, then you have an additional problem: you're not even trying to write multiple rows; you're just looping over all of the rows and then, after you're done, writing the last one.

    If that's the case, you need to indent the writerows to be part of the loop, as well as making the other fix.
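Putting both fixes together might look roughly like the sketch below. It assumes Python 3, and to stay runnable without network access it replaces the scraping step with a hypothetical first_row helper that returns values copied from the question's print output. An in-memory io.StringIO stands in for the file; in a real script you would use open("hello world.csv", "w", newline="") instead.

```python
import csv
import io

# Hypothetical stand-in for the urllib2/BeautifulSoup step in the question.
# Values are copied from the question's print output.
SAMPLE = {
    "bitcoin": ["Aug 08, 2018", "6746.85", "6746.85", "6226.22", "6305.80", "5,064,430,000"],
    "litecoin": ["Aug 08, 2018", "68.16", "68.16", "62.14", "62.49", "313,187,000"],
}

def first_row(name):
    """Return the cell texts of the first data row for one currency."""
    return SAMPLE[name]

def write_all(names, f):
    wr = csv.writer(f)          # writer created once, before the loop
    for name in names:
        cells = first_row(name)
        wr.writerow(cells[:6])  # one row written per iteration, inside the loop

buf = io.StringIO()  # in-memory stand-in for the opened CSV file
write_all(["bitcoin", "litecoin"], buf)
print(buf.getvalue())
```

The key points are the order and the indentation: the file and the writer are set up once before the loop, and the write call sits inside it.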


    Also, as a side note, if you want to write a single row, there's no need to wrap that row in a single-element list and pass it to writerows. Just call writerow with the row.
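As a quick check of that last point, this small sketch (Python 3 assumed, with a shortened sample row) writes the same row both ways and compares the results:

```python
import csv
import io

# Shortened sample row; values taken from the question's output.
row = ("Aug 08, 2018", "68.16", "313,187,000")

a = io.StringIO()
csv.writer(a).writerows([row])  # row wrapped in a single-element list

b = io.StringIO()
csv.writer(b).writerow(row)     # row passed directly

same = a.getvalue() == b.getvalue()
print(same)
```

Both calls produce the same line. Fields that contain a comma, such as the date and the volume, are quoted by the writer automatically.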