How to write to an Azure output blob with the Python csv writer?


I am trying to read an Excel file from blob storage in an Azure Function and output a CSV file. I have successfully read the Excel file with openpyxl and processed it, but I am having trouble writing to the output blob. I am not very familiar with Python streams. My original code opened a local file; now I am trying to open an in-memory stream, write to it, and then save that stream to the output blob. My function.json looks like this:

{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "myblob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "example/{name}.xlsx",
      "connection": "AzureWebJobsStorage"
    },
    {
      "name": "outputblob",
      "type": "blob",
      "path": "example/processed-csv.csv",
      "connection": "AzureWebJobsStorage",
      "direction": "out"
    }
  ]
}

My code looks like this:

logging.info(f"Python blob trigger function processed blob \n"
             f"Name: {myblob.name}\n"
             f"Blob Size: {myblob.length} bytes")
print('loading workbook')
filename = myblob.name
wb = openpyxl.load_workbook(filename = BytesIO(myblob.read()))
logging.info(f"Loaded workbook \n")
### Excel processing here
logging.info(f"writing csv \n")
with StringIO() as f:
    c = csv.writer(f)
    for r in coreWorksheet.rows:
        c.writerow([cell.value for cell in r])
    outputblob.set(f)

This code fails with ValueError: I/O operation on closed file. I have also tried writing to the output blob in a couple of different ways.


Solution

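  • Pass the buffer's contents, not the buffer itself: use outputblob.set(f.getvalue()). With outputblob.set(f), the binding only holds a reference to the StringIO object, and the with block closes that object on exit. The error most likely arises when the Functions runtime later tries to read from the closed buffer. getvalue() returns the accumulated text as a str while the buffer is still open.

    Please refer to the following code: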

    import logging
    import openpyxl
    import azure.functions as func
    import io
    import csv
    
    
    def main(myblob: func.InputStream, outputblob: func.Out[str]):
        logging.info(f"Python blob trigger function processed blob \n"
                     f"Name: {myblob.name}\n"
                     f"Blob Size: {myblob.length} bytes")
        logging.info('loading workbook')
        wb = openpyxl.load_workbook(filename = io.BytesIO(myblob.read()))
        coreWorksheet = wb['testSheet1']
        logging.info(coreWorksheet)
        ### Excel processing here
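        logging.info("writing csv")
        # Build the CSV in memory; csv.writer needs a text stream.
        with io.StringIO() as f:
            c = csv.writer(f)
            for r in coreWorksheet.rows:
                c.writerow([cell.value for cell in r])
            # Hand over the text itself while the buffer is still open.
            outputblob.set(f.getvalue())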
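
The difference between passing the buffer and passing its contents can be checked without Azure at all. The following is a minimal sketch using only the standard library; the helper name rows_to_csv_text and the sample rows are hypothetical and stand in for the worksheet data. It shows that getvalue() must be called before the with block ends, and that reading from the buffer afterwards raises the same ValueError as in the question.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize an iterable of row sequences to CSV text (hypothetical helper)."""
    # newline="" leaves the csv module's own "\r\n" line endings untouched.
    with io.StringIO(newline="") as buffer:
        csv.writer(buffer).writerows(rows)
        # Copy the text out while the buffer is still open.
        return buffer.getvalue()


# Hypothetical sample rows standing in for the worksheet's cell values.
csv_text = rows_to_csv_text([["id", "name"], [1, "Ada"]])

# Reproduce the failure: the buffer is closed once the with block ends,
# so any later read on it raises ValueError.
with io.StringIO() as closed_buffer:
    closed_buffer.write("x")
try:
    closed_buffer.read()
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Inside the function, such a helper could presumably be called as outputblob.set(rows_to_csv_text([cell.value for cell in r] for r in coreWorksheet.rows)). Note that csv.writer writes None cell values as empty strings.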