django amazon-web-services amazon-s3 filefield

How to use Django FileField with dynamic Amazon S3 bucket?


I have a Django model with a FileField, and a default storage backed by an Amazon S3 bucket (via the excellent django-storages).

My problem is not to upload files to a dynamic folder path (as we see in many other answers). My problem is deeper and twofold:

  • Files are already inside an Amazon S3 bucket, and I do not want to download and re-upload them (worse: I only have read access to them).
  • Files are accessible through S3 credentials that can differ from one file to another (that is, files can be located in different buckets and accessed through different credentials). Hence, my FileField must have a dynamic storage.

Any idea?

(Django 1.11, Python 3).


Solution

  • It turns out it is not so difficult. But the code below has not been tested much, so please do not copy-paste it without checking!

    I have created a custom FileField subclass:

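    from django.db import models
    from storages.backends.s3boto3 import S3Boto3StorageFile
    
    class DynamicS3BucketFileField(models.FileField):
        # DynamicS3BucketFileDescriptor is shown further down; define it before this class.
        attr_class = S3Boto3StorageFile
        descriptor_class = DynamicS3BucketFileDescriptor
    
        def pre_save(self, model_instance, add):
            # Return the stored value without committing (uploading) the file.
            return getattr(model_instance, self.attname)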
    

    Note that the attr_class is specifically using the S3Boto3StorageFile class (a File subclass provided by django-storages).

    The pre_save override has only one goal: to avoid the internal file.save call that would attempt to re-upload the file.
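
    To make the effect of that override concrete, here is a minimal, Django-free sketch. PendingFile, UploadingField, ReadOnlyField and Record are hypothetical stand-ins (not Django classes); UploadingField only mimics the part of the stock pre_save that commits a pending file before the model is saved.

```python
class PendingFile:
    """Hypothetical stand-in for a field file that is not committed yet."""

    def __init__(self, name):
        self.name = name
        self._committed = False
        self.uploads = 0

    def save(self):
        # A real field file would write its content to the storage here.
        self.uploads += 1
        self._committed = True


class UploadingField:
    """Mimics the stock behaviour: commit a pending file in pre_save."""

    attname = "file"

    def pre_save(self, model_instance, add):
        file = getattr(model_instance, self.attname)
        if file and not file._committed:
            file.save()
        return file


class ReadOnlyField(UploadingField):
    """Same override as in the answer: return the value, never commit."""

    def pre_save(self, model_instance, add):
        return getattr(model_instance, self.attname)


class Record:
    """Hypothetical model-like object holding one pending file."""

    def __init__(self, name):
        self.file = PendingFile(name)
```

    With the override, the pending file is handed back untouched, so nothing is written to a bucket that may be read-only.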

    The magic happens inside the FileDescriptor subclass:

    from django.db.models.fields.files import FileDescriptor
    from django.utils import six  # bundled with Django 1.11
    
    
    class DynamicS3BucketFileDescriptor(FileDescriptor):
        def __get__(self, instance, cls=None):
            if instance is None:
                return self
    
            # Copied from FileDescriptor
            if self.field.name in instance.__dict__:
                file = instance.__dict__[self.field.name]
            else:
                instance.refresh_from_db(fields=[self.field.name])
                file = getattr(instance, self.field.name)
    
            # The field is nullable: nothing to wrap when no file is set.
            if file is None:
                return None
    
            # Resolve the storage for this instance. The field object is shared
            # by all instances of the model, so the callable must stay on the
            # field: overwriting self.field.storage would pin every later
            # instance to the first instance's bucket.
            storage = self.field.storage
            if callable(self.field.storage):
                storage = self.field.storage(instance)
    
            # The file can be a string here (depending on when/how we access the field).
            if isinstance(file, six.string_types):
                # Instantiate the file following the S3Boto3StorageFile constructor.
                file = self.field.attr_class(file, 'rb', storage)
                # As in FileDescriptor, cache the wrapped file on the instance.
                instance.__dict__[self.field.name] = file
    
            # Copied from FileDescriptor. The difference here is that these 3
            # properties are set systematically without conditions.
            file.instance = instance
            file.field = self.field
            file.storage = storage
            # Added a very handy property to file.
            file.url = storage.url(file.name)
    
            return instance.__dict__[self.field.name]
    

    The code above adapts some internal code of FileDescriptor to my case. Note the if callable(self.field.storage): test, explained below: the storage is resolved on each access and kept in a local variable, because the field object is shared by every instance of the model.

    The key line is: file = self.field.attr_class(file, 'rb', storage), which creates a valid instance of S3Boto3StorageFile when the current value is still a plain string (sometimes it is already a file, sometimes it is a simple string; that is part of the FileDescriptor business).
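
    The mechanism can be tried without Django or S3. The sketch below is a simplified stand-in, not the actual Django classes: FakeStorage, FakeFile, Media and pick_storage are hypothetical names. It shows a data descriptor that wraps a stored key into a file object and picks the storage from the instance on each access.

```python
class FakeStorage:
    """Hypothetical stand-in for a storage: it only knows its bucket name."""

    def __init__(self, bucket):
        self.bucket = bucket

    def url(self, name):
        return "https://{}.example.invalid/{}".format(self.bucket, name)


class FakeFile:
    """Hypothetical stand-in for a file built from (name, mode, storage)."""

    def __init__(self, name, mode, storage):
        self.name, self.mode, self.storage = name, mode, storage


class DynamicFileDescriptor:
    def __init__(self, name, storage):
        self.name = name
        self.storage = storage  # a storage object, or a callable(instance)

    def __set__(self, instance, value):
        # Defining __set__ makes this a data descriptor, so __get__ runs
        # even though the raw value lives in instance.__dict__.
        instance.__dict__[self.name] = value

    def __get__(self, instance, cls=None):
        if instance is None:
            return self
        value = instance.__dict__.get(self.name)
        if value is None:
            return None
        # Resolve per instance; the descriptor itself is shared by all of them.
        storage = self.storage(instance) if callable(self.storage) else self.storage
        if isinstance(value, str):
            value = FakeFile(value, "rb", storage)
            instance.__dict__[self.name] = value
        value.storage = storage
        value.url = storage.url(value.name)
        return value


def pick_storage(instance):
    # Any attribute of the instance can drive the choice of storage.
    return FakeStorage(instance.bucket)


class Media:
    file = DynamicFileDescriptor("file", pick_storage)

    def __init__(self, bucket, key):
        self.bucket = bucket
        self.file = key
```

    Two Media objects created with different bucket names end up with files bound to different storages, which is the behaviour the real descriptor is after.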

    Now, the dynamic part comes quite simply. When declaring the field, you can pass a function as the storage option (Django 1.11 stores it as is, and the descriptor above calls it). Like this:

    class MyMedia(models.Model):
        class Meta:
            app_label = 'appname'
    
        mediaset = models.ForeignKey(Mediaset, on_delete=models.CASCADE, related_name='media_files')
        file = DynamicS3BucketFileField(null=True, blank=True, storage=get_fits_file_storage)
    

    The function get_fits_file_storage is called by the descriptor with a single argument: the instance of MyMedia. Hence, I can use any property of that object to return the valid storage. In my case, that is mediaset, which contains a key that allows me to retrieve an object holding the S3 credentials, from which I can build an S3Boto3Storage instance (another class provided by django-storages).

    Specifically:

    def get_fits_file_storage(instance):
        name = instance.mediaset.archive_storage_name
        return instance.mediaset.archive.bucket_keys.get(name=name).get_storage()
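
    The get_storage method itself is not shown above. One possible shape, as a sketch only: BucketKey and its fields are hypothetical (the answer's code fetches the object through a Django manager; a dataclass replaces it here), and it assumes that S3Boto3Storage accepts bucket_name, access_key and secret_key as keyword arguments, which should be checked against the installed django-storages version.

```python
from dataclasses import dataclass


@dataclass
class BucketKey:
    """Hypothetical credentials record for one bucket."""

    name: str
    bucket_name: str
    access_key: str
    secret_key: str

    def storage_kwargs(self):
        # Keyword names are an assumption about S3Boto3Storage's settings.
        return {
            "bucket_name": self.bucket_name,
            "access_key": self.access_key,
            "secret_key": self.secret_key,
        }

    def get_storage(self):
        # Imported lazily so the sketch loads without django-storages;
        # calling this method needs django-storages and configured settings.
        from storages.backends.s3boto3 import S3Boto3Storage

        return S3Boto3Storage(**self.storage_kwargs())
```

    Keeping the keyword mapping in its own method makes it easy to check which credentials a given key would use without touching S3.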
    

    Et voilà!