python csv odoo decode encode

How to properly open and encode a CSV file in Python so it can be processed in the Odoo framework


I tried to import a CSV file in a custom Odoo module, but my logic fails at the point where the file object is decoded. Below is my code:

def import_csv(self, csv_file):

    reader = csv.reader(csv_file)
    next(reader)
    for row in reader:
        record = {
            'name'                  : row[0],
            'component_name'        : row[1],
            'percentage'            : row[2],
            'processing_start_date' : row[3],
            'finished_real_date'    : row[4],
        }
        self.env['item.master'].create(record)

def action_import_csv(self):
    outfile = open('test.csv', 'r')
    data_record = outfile.read()
    ir_values = {
        'name': 'test.csv',
        'datas': data_record,
    }
    data_id = self.env['ir.attachment'].sudo().create(ir_values)
    self.import_csv(data_id)

It raises an error:

binascii.Error: Invalid base64-encoded string: number of data characters (141) cannot be 1 more than a multiple of 4

What is actually wrong in my code?

I also tried adding this line:

data_record = base64.b64encode(outfile.read())

right after opening the file, but then a different error is raised:

TypeError: a bytes-like object is required, not 'str'


Solution

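  • When saving an attachment, you need to base64-encode it; likewise, when retrieving it, you must base64-decode it. The first error occurs because the raw CSV text stored in 'datas' is not valid base64. The second occurs because the file was opened in text mode ('r'), so read() returns a str, while base64.b64encode() only accepts bytes.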

    Here is how you might create an attachment instance (in the Odoo 14 shell):

    >>> import base64, csv, io
    >>> # Example csv data.
    >>> data = """a,b,c\n1,2,3\n4,5,6"""
    >>> Att = env['ir.attachment']
    >>> # Encoding as UTF-8 is not required if the data is already bytes, for example if
    >>> # you read the csv file in binary mode ('rb').
    >>> att = Att.create({'name': 'foo', 'datas': base64.b64encode(data.encode('utf-8')), 'mimetype': 'text/csv'})
    >>> att.datas
    b'YSxiLGMKMSwyLDMKNCw1LDY='
    >>> env.cr.commit()
    

    Here is how you can retrieve the data and pass it to the csv reader:

    >>> # Load the decoded data into a file-like object that csv.reader can use.
    >>> buf = io.StringIO(base64.b64decode(att.datas).decode('utf-8'))
    >>> reader = csv.reader(buf)
    >>> list(reader)
    [['a', 'b', 'c'], ['1', '2', '3'], ['4', '5', '6']]
    >>> buf.close()
    >>>
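
The same round trip can be tried outside Odoo. The following is a minimal sketch using only the standard library; it assumes nothing about `ir.attachment` and simply reproduces the two errors from the question before showing the encode/decode pair that works. The variable names are illustrative.

```python
import base64
import binascii
import csv
import io

# Same example data as in the shell session above.
csv_text = "a,b,c\n1,2,3\n4,5,6"

# b64encode() only accepts bytes, so passing a str (as returned by a file
# opened in text mode) raises TypeError.
try:
    base64.b64encode(csv_text)
    encode_error = None
except TypeError as exc:
    encode_error = exc

# Raw CSV text is not valid base64, so trying to decode it raises
# binascii.Error.
try:
    base64.b64decode(csv_text)
    decode_error = None
except binascii.Error as exc:
    decode_error = exc

# Working round trip: str -> bytes -> base64 -> bytes -> str -> csv rows.
encoded = base64.b64encode(csv_text.encode("utf-8"))
decoded = base64.b64decode(encoded).decode("utf-8")
rows = list(csv.reader(io.StringIO(decoded)))
```

Here `encoded` plays the role of the value written to 'datas', and `rows` is what the csv reader yields after decoding.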
    

    Your code might look like this (untested):

        # Module-level imports needed by the methods below.
        import base64
        import csv
        import io

        def import_csv(self, attachment):
    
            # The correct encoding will be that used to encode the original file.
            # Modern systems will use UTF-8, but some Windows systems could use UTF-8-SIG, 
            # UTF-16 or a legacy 8-bit encoding like cp1252.
            csv_data = base64.b64decode(attachment.datas).decode('utf-8')
            csv_file = io.StringIO(csv_data)
            
            reader = csv.reader(csv_file)
            next(reader)
            for row in reader:
                record = {
                    'name'                  : row[0],
                    'component_name'        : row[1],
                    'percentage'            : row[2],
                    'processing_start_date' : row[3],
                    'finished_real_date'    : row[4],
                }
                self.env['item.master'].create(record)
    
        def action_import_csv(self):
            # Read in binary mode: b64encode() requires bytes, not str.
            with open('test.csv', 'rb') as infile:
                data_record = infile.read()
            ir_values = {
                'name': 'test.csv',
                'datas': base64.b64encode(data_record),
            }
            attachment = self.env['ir.attachment'].sudo().create(ir_values)
            self.import_csv(attachment)
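
If the encoding of the uploaded file is not known in advance, the comment about UTF-8-SIG, UTF-16 and cp1252 above could be turned into a small fallback routine. This is only a sketch: `decode_csv_bytes` and its `fallbacks` parameter are hypothetical names, not part of Odoo or the standard library, and the order of attempts is an assumption that may not suit every data source.

```python
import codecs


def decode_csv_bytes(raw, fallbacks=("cp1252",)):
    """Hypothetical helper: decode CSV bytes whose encoding is uncertain."""
    # UTF-16 files written by Windows tools normally start with a
    # byte-order mark, which the 'utf-16' codec consumes while decoding.
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")
    # 'utf-8-sig' reads plain UTF-8 and also strips a leading BOM if present.
    for encoding in ("utf-8-sig",) + tuple(fallbacks):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode CSV data with the configured encodings")
```

In `import_csv`, the result of `base64.b64decode(attachment.datas)` could be passed to such a helper instead of calling `.decode('utf-8')` directly. A legacy 8-bit codec will accept almost any byte sequence, so a wrong guess produces garbled text rather than an error; when the source encoding is known, naming it explicitly is the safer choice.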