I have code that scrapes the Rotten Tomatoes website for the top 100 movies. After parsing, the data is put into a list. Here is the code:
# create and write headers to a list
rows = []
rows.append(['Rank', 'Rating', 'Title', 'No. of Reviews'])
print(rows)
# loop over results
for result in results:
    # find all columns per result
    data = result.find_all('td')
    # check that columns have data
    if len(data) == 0:
        continue
    # write columns to variables
    rank = data[0].getText()
    rating = data[1].getText()
    title = data[2].getText()
    reviews = data[3].getText()
    # write each result to rows
    rows.append([rank, rating, title, reviews])
print(rows)
And the output looks like this:
[['Rank', 'Rating', 'Title', 'No. of Reviews'], ['1.', '\n\n\n\xa096%\n\n', '\n\n Black Panther (2018)\n', '503'], ['2.', '\n\n\n\xa094%\n\n', '\n\n Avengers: Endgame (2019)\n', '514'], ['3.', '\n\n\n\xa093%\n\n', '\n\n Us (2019)\n', '520'], ['4.', '\n\n\n\xa097%\n\n', '\n\n Toy Story 4 (2019)\n', '433'], ['5.', '\n\n\n\xa098%\n\n', '\n\n The Wizard of Oz (1939)\n', '117'], ['6.', '\n\n\n\xa099%\n\n', '\n\n Lady Bird (2017)\n', '388']...
Then I wrote the data to a CSV file:
# Create csv and write rows to output file
with open('rottentomato.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerows(rows)
But only the 'Rank' and 'No. of Reviews' columns have data. The 'Rating' and 'Title' columns are empty.
I tried to reproduce your problem, and the only issue I found was that the special characters (the newlines and the non-breaking space '\xa0') were creating empty space around the values. You can clean those with strip():
import csv
rows = [['Rank', 'Rating', 'Title', 'No. of Reviews'], ['1.', '\n\n\n\xa096%\n\n', '\n\nBlack Panther (2018)\n', '503'], ['2.', '\n\n\n\xa094%\n\n', '\n\nAvengers: Endgame (2019)\n', '514'], ['3.', '\n\n\n\xa093%\n\n', '\n\nUs (2019)\n', '520'], ['4.', '\n\n\n\xa097%\n\n', '\n\nToy Story 4 (2019)\n', '433'], ['5.', '\n\n\n\xa098%\n\n', '\n\nThe Wizard of Oz (1939)\n', '117'], ['6.', '\n\n\n\xa099%\n\n', '\n\nLady Bird (2017)\n', '388']]
# strip leading/trailing whitespace (including '\n' and '\xa0') from every cell
for i, row in enumerate(rows):
    for j, data in enumerate(row):
        rows[i][j] = data.strip()
with open('rottentomato.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerows(rows)
This was the output I got:
Rank,Rating,Title,No. of Reviews
1.,96%,Black Panther (2018),503
2.,94%,Avengers: Endgame (2019),514
3.,93%,Us (2019),520
4.,97%,Toy Story 4 (2019),433
5.,98%,The Wizard of Oz (1939),117
6.,99%,Lady Bird (2017),388
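If you want to see why the uncleaned cells look empty, here is a minimal sketch that writes the same row to an in-memory buffer before and after stripping. The helper names clean_rows and to_csv_text are made up for this illustration and are not part of your code or of the csv module. The sample row is copied from your output.

```python
import csv
import io


def clean_rows(rows):
    """Return a new list of rows with surrounding whitespace stripped from every cell."""
    return [[cell.strip() for cell in row] for row in rows]


def to_csv_text(rows):
    """Serialize rows to CSV text in memory so the result can be inspected."""
    buffer = io.StringIO(newline='')
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


raw = [
    ['Rank', 'Rating', 'Title', 'No. of Reviews'],
    ['1.', '\n\n\n\xa096%\n\n', '\n\n Black Panther (2018)\n', '503'],
]

# Uncleaned: fields that contain newlines are quoted by the csv module,
# so a single record spans several physical lines.
print(repr(to_csv_text(raw)))

# Cleaned: each record fits on one line.
print(to_csv_text(clean_rows(raw)))
```

The first print should show the rating and title wrapped in quotes and starting with blank lines, which is probably why those cells appear empty in a spreadsheet. If the cells come from Beautiful Soup, calling get_text(strip=True) instead of getText() when you extract them should give the same result without a second pass, though I have not run that against the live page.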