Parse bytes to str while reading csv with Python

I have Python code that writes UTF-8 strings to a CSV file and reads them back:

import csv

test1='ab"cc"dd'.encode('utf8')
test2='bbb'.encode('utf8')
csv_file = open('test.csv','w')
writer= csv.writer(csv_file)
writer.writerow([test1,test2])
csv_file.close()

with open('test.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='"')
    print(spamreader)
    for row in spamreader:
        print(', '.join(row))

The problem is that when I read the file back I get b'ab"cc"dd', b'bbb' instead of ab"cc"dd, bbb.

How can I decode those strings? (I must write UTF-8 to the CSV file.)

Solution

There is no need to encode or decode manually. Open the file with the specific encoding you want, because the default encoding varies with OS configuration. Encode when writing the file, decode when reading it, and work only with Unicode strings (str) inside the Python script. This pattern is called the "Unicode sandwich".

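Also, csv.reader and csv.writer expect Unicode strings, so passing encoded byte strings is incorrect. csv.writer converts non-string values with str(), and str() of a bytes object is its repr, which is why the literal text b'ab"cc"dd' ended up in the file.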
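To see where the b'...' text comes from, here is a minimal sketch that writes the question's row to an in-memory buffer instead of a file. The names buffer, written and row are only illustrative and do not appear in the original code.

```python
import csv
import io

# Write the same bytes values as in the question, but into memory.
buffer = io.StringIO()
csv.writer(buffer).writerow(['ab"cc"dd'.encode('utf8'), 'bbb'.encode('utf8')])
written = buffer.getvalue()

# The writer stringified each bytes object, so the b'...' prefix
# and the quotes are now part of the stored text.
print(repr(written))

# Reading it back yields ordinary str values that merely look like bytes literals.
row = next(csv.reader(io.StringIO(written)))
print(row)
```

Each field comes back as a str whose content starts with b', so the data itself has been altered. Decoding on read cannot undo this cleanly, which is why the fix belongs on the writing side.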

import csv

test1 = 'ab"cc"dd'
test2 = 'bbb'
with open('test.csv', 'w', encoding='utf8', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow([test1, test2])

with open('test.csv', encoding='utf8', newline='') as csvfile:
    spamreader = csv.reader(csvfile)
    for row in spamreader:
        print(row)
        print(', '.join(row))

Output:

['ab"cc"dd', 'bbb']
ab"cc"dd, bbb

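Additionally, if you want your CSV files to open correctly in Microsoft Excel, use utf-8-sig as the encoding. It writes a byte order mark (BOM) at the start of the file; without it, Excel may not detect UTF-8 and can display non-ASCII characters incorrectly.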
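As a rough sketch of the utf-8-sig variant, the example below writes the same row, checks that the file starts with the UTF-8 BOM, and reads it back with the same encoding so the BOM is stripped again. The temporary directory and the file name excel_test.csv are assumptions made for illustration.

```python
import codecs
import csv
import os
import tempfile

# Hypothetical file in a temporary directory, so nothing existing is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'excel_test.csv')

# utf-8-sig prepends the BOM bytes EF BB BF when writing.
with open(path, 'w', encoding='utf-8-sig', newline='') as csv_file:
    csv.writer(csv_file).writerow(['ab"cc"dd', 'bbb'])

# Inspect the raw bytes to confirm the BOM is present.
with open(path, 'rb') as raw:
    starts_with_bom = raw.read(3) == codecs.BOM_UTF8

# Reading with utf-8-sig removes the BOM, so the first field is clean.
with open(path, encoding='utf-8-sig', newline='') as csv_file:
    rows = list(csv.reader(csv_file))

print(starts_with_bom)
print(rows)
```

If such a file is read with plain utf8 instead, the BOM stays in the data as a leading '\ufeff' character on the first field, so use utf-8-sig on both sides.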