I have the following class:
class VideoFile(models.Model):
    media_file = models.FileField(upload_to=update_filename, null=True)
When I upload large files to it (from 100 MB up to 2 GB) using the following view, it can take quite a long time after the upload itself has finished, during the VideoFile.save() call.
def upload(request):
    # uploaded_file is the uploaded file object (e.g. pulled out of request.FILES)
    video_file = VideoFile.objects.create(uploader=request.user.profile)
    video_file.media_file = uploaded_file
    video_file.save()
On my MacBook Pro (Core i7, 8 GB RAM), a 300 MB uploaded file can take around 20 seconds to run video_file.save().
I suspect this delay is related to a disk copy operation from /tmp to the file's permanent location. I've confirmed this by running watch ls -l on the target directory: as soon as video_file.save() runs, the file appears and grows throughout the delay.
Is there a way to eliminate this file transfer delay, either by uploading the file directly to the target filename or by moving the original file instead of copying it? This is not the only upload operation across the site, however, so any solution needs to be localized to this model.
Thanks for any advice!
UPDATE:
As further evidence that a copy is happening rather than a move, I can watch lsof during the upload and see a file within /private/var/folders/... being written from Python, which maps exactly to the upload progress. After the upload completes, another lsof entry appears for the final file location and grows over time. Once that finishes, both entries disappear.
OK, after a bit of digging I've come up with a solution. It turns out Django's default storage already attempts to move the file instead of copying it, which it tests for first with:

hasattr(content, 'temporary_file_path')

This attribute exists on the class TemporaryUploadedFile, which is the object handed to the upload view; however, the field itself is created as the class specified by FileField.attr_class.
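For reference, the move-or-copy decision inside Django's FileSystemStorage._save boils down to something like the following (a paraphrased sketch, not the exact source):

from django.core.files.move import file_move_safe

def _save_sketch(full_path, content):
    # Paraphrased: if the uploaded content already exists as a temp file on disk,
    # Django moves/renames it into place; otherwise it streams a chunk-by-chunk copy.
    if hasattr(content, 'temporary_file_path'):
        file_move_safe(content.temporary_file_path(), full_path)
    else:
        with open(full_path, 'wb') as destination:
            for chunk in content.chunks():
                destination.write(chunk)

So as long as the content object handed to the storage exposes temporary_file_path, the expensive copy is skipped.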
So instead I decided to subclass FieldFile and FileField and slot in the temporary_file_path attribute:
from django.db.models.fields.files import FieldFile, FileField

class VideoFieldFile(FieldFile):
    _temporary_file_path = None

    def temporary_file_path(self):
        # The storage checks for this method and moves the file instead of copying it
        return self._temporary_file_path

class VideoFileField(FileField):
    attr_class = VideoFieldFile  # values of this field use the subclassed FieldFile
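For completeness, the model then uses the custom field in place of the stock FileField (a sketch assuming the rest of the model is unchanged):

class VideoFile(models.Model):
    # Same definition as before, just swapping in the subclassed field
    media_file = VideoFileField(upload_to=update_filename, null=True)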
Finally, in the view, before saving the model I manually assigned the temp path:
video_file.media_file._temporary_file_path = uploaded_file.temporary_file_path()
My 1.1 GB test file now becomes available in about 2-3 seconds rather than the 50 seconds or so I was seeing before. It also comes with the added benefit that if the source and destination are on different file systems, it appears to fall back to the copy operation.
As a side note, however, my site is not using MemoryFileUploadHandler, which some sites may use to handle smaller file uploads, so I'm not sure how well my solution would work with that; but it would be simple enough to detect the uploaded file's class and act accordingly, as in the sketch below.
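For example, something along these lines should work (a sketch; the 'media_file' key is illustrative, and only TemporaryUploadedFile actually exposes temporary_file_path()):

from django.core.files.uploadedfile import TemporaryUploadedFile

def upload(request):
    uploaded_file = request.FILES['media_file']  # illustrative key
    video_file = VideoFile.objects.create(uploader=request.user.profile)
    video_file.media_file = uploaded_file
    if isinstance(uploaded_file, TemporaryUploadedFile):
        # Large upload already sitting on disk: expose its temp path so storage can move it
        video_file.media_file._temporary_file_path = uploaded_file.temporary_file_path()
    # In-memory uploads (MemoryFileUploadHandler) have no temp file, so skip the assignment
    video_file.save()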