python django gzip http-streaming

Django - HTTP stream of a gzip file created on the fly


I used StringIO to create a file object that holds the XML data, then created a gzip file on top of that file object. I am having trouble streaming this over HTTP with Django. The file size is not fixed and can sometimes be large, which is why I opted for a StreamingHttpResponse instead of a normal HttpResponse. I also can't figure out how to send the file length, because the file object is not seekable.

Thank you for your help, cheers!

Here is my code:

# Get the XML string
xml_string = get_xml_cebd(cebds)

# Write the XML string to the StringIO holder
cebd_xml_file.write(xml_string)

# Flush the xml file buffer
cebd_xml_file.flush()

# Create the Gzip file handler, with the StringIO in the fileobj
gzip_handler = gzip.GzipFile(
    fileobj=cebd_xml_file,
    filename=base_file_name + '.xml',
    mode='wb'
)

# Write the XML data into the gzipped file
gzip_handler.write(cebd_xml_file.getvalue())
gzip_handler.flush()

# Generate the response using the file wrapper, content type: x-gzip
# Since this file can be big, it is best to use StreamingHttpResponse;
# that way we can guarantee that the file will be sent entirely
response = StreamingHttpResponse(
    gzip_handler,
    content_type='application/x-gzip'
)

# Add content disposition so the browser will download the file
response['Content-Disposition'] = ('attachment; filename=' +
    base_file_name + '.xml.gz')

# Content size
gzip_handler.seek(0, os.SEEK_END)
response['Content-Length'] = gzip_handler.tell()

Solution

  • I found the solution to both problems. The data passed to the StreamingHttpResponse has to come from the StringIO buffer, not from the gzip handler. The StringIO buffer is also seekable, so I can check the size of the data after it has been gzipped. Another trick is to call close() on the gzip handler so that it writes the CRC32 and size trailer to the gzip file; otherwise, in my case, the data sent was 0 bytes. Do not call close() on the StringIO buffer, though: the response still needs it open to stream the data, and the garbage collector will clean it up once the stream is done.

    This is the final code:

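    # Imports needed by this snippet (Python 2)
    import gzip
    import os
    import StringIO
    
    from django.http import StreamingHttpResponse
    
    # Use StringIO to create the file in memory, on the fly
    cebd_xml_file = StringIO.StringIO()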
    
    # Create the file name ...
    base_file_name = "cebd"
    
    # Get the XML String
    xml_string = get_xml_cebd(cebds)
    
    # Create the Gzip file handler, with the StringIO in the fileobj
    gzip_handler = gzip.GzipFile(
        fileobj=cebd_xml_file,
        filename=base_file_name + '.xml',
        mode='wb'
    )
    
    # Write the XML data into the gzipped file
    gzip_handler.write(xml_string)
    
    # Flush the data
    gzip_handler.flush()
    
    # Close the Gzip handler, the close method will add the CRC32 and the size
    gzip_handler.close()
    
    # Generate the response from the buffer contents, content type: x-gzip
    # Since this file can be big, it is best to use StreamingHttpResponse;
    # that way we can guarantee that the file will be sent entirely
    response = StreamingHttpResponse(
        cebd_xml_file.getvalue(),
        content_type='application/x-gzip'
    )
    
    # Add Content-Disposition so the browser will download the file, don't use mime type!
    response['Content-Disposition'] = ('attachment; filename=' +
        base_file_name + '.xml.gz')
    
    # Content size
    cebd_xml_file.seek(0, os.SEEK_END)
    response['Content-Length'] = cebd_xml_file.tell()
    
    # Send back the response to the request, don't close the StringIO handler!
    return response
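
For readers on Python 3, where the StringIO module no longer exists, the same steps can be sketched with io.BytesIO. This is a minimal sketch under assumptions, not the code above: build_gzip_buffer, iter_chunks, the 8192-byte chunk size and the sample XML are all hypothetical names and values chosen for illustration. It closes the GzipFile so the CRC32 and size trailer get written, reads the compressed size from the buffer position, then rewinds the buffer so it can be read back in chunks.

```python
import gzip
import io


def build_gzip_buffer(xml_bytes, inner_name="cebd.xml"):
    """Compress xml_bytes into an in-memory gzip file (hypothetical helper).

    Returns the rewound buffer and the compressed size in bytes.
    """
    buffer = io.BytesIO()
    # Leaving the with-block calls close() on the GzipFile, which writes the
    # CRC32 and size trailer. The underlying BytesIO is left open.
    with gzip.GzipFile(filename=inner_name, mode="wb", fileobj=buffer) as gz:
        gz.write(xml_bytes)
    # The buffer position now sits at the end, so it equals the gzip size.
    size = buffer.tell()
    # Rewind so that the next read starts at the first byte.
    buffer.seek(0)
    return buffer, size


def iter_chunks(fileobj, chunk_size=8192):
    """Yield chunks of at most chunk_size bytes until fileobj is exhausted."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Sample data standing in for the output of get_xml_cebd(cebds).
xml_bytes = b"<cebds>" + b"<cebd/>" * 1000 + b"</cebds>"
buffer, size = build_gzip_buffer(xml_bytes)
payload = b"".join(iter_chunks(buffer))
```

In a Django view, iter_chunks(buffer) could presumably be handed to StreamingHttpResponse and size used for the Content-Length header, as in the final code above. Note that the whole compressed file still sits in memory with this approach, so it limits the size of each write to the client rather than the memory used on the server.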
    

    Cheers, hope this helps someone.