
Exporting data with Django and import_export, columns are duplicated


I am creating a Django app and I'm using the import_export package. I have defined my resource with fields that have both the attribute and column_name set. When I export to xlsx (or csv), I get one column with the column_name as its header, plus a duplicate column with the attribute as its header.

Including the fields attribute in the Meta subclass doesn't affect this behavior.

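# app\resources.py
from import_export import resources
from import_export.fields import Field

from .models import Role  # assumed location of the Role model

class RoleResource(resources.ModelResource):
    name = Field(attribute="name", column_name="Sales Role")
    role = Field(attribute="default_role", column_name="System Role")
    plan = Field(attribute="default_plan", column_name="System Plan")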

    class Meta:
        model = Role
        # fields = ('name', 'default_role', 'default_plan') # commenting this doesn't change behavior

The final output with Meta.fields commented out has 6 columns: Sales Role, System Role, System Plan, id, default_role, default_plan.

The final output with Meta.fields uncommented has 5 columns: Sales Role, System Role, System Plan, default_role, default_plan.

I thought the column_name was cosmetic. Why am I getting two duplicated columns?


Solution

  • Why am I getting two duplicated columns?

    Because the resource mixes declared fields (role, plan) with model fields picked up through the fields Meta option (default_role, default_plan), and the two sets have different names.

    To fix this, rename the declared fields so that they match the model field names:

    class RoleResource(resources.ModelResource):
        name = Field(attribute="name", column_name="Sales Role")
        default_role = Field(attribute="default_role", column_name="System Role")
        default_plan = Field(attribute="default_plan", column_name="System Plan")
    
        class Meta:
            model = Role
            fields = ('name', 'default_role', 'default_plan') 
    

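    What's happening is that when the resource is instantiated, the logic looks to see which fields have been declared on the resource, and adds those to a local dict of fields. It then does the same for the model fields. The key for a declared field is its name on the resource class (for example role), not the value of its attribute argument, while a model field is keyed by its model field name (for example default_role). If the two names differ, the dict ends up with two entries for the same data, and you get duplicate columns.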

    For example:

    With declared fields matching fields options:

    class BookResource(resources.ModelResource):
        # declared names match `fields`
        name = Field(attribute="name", column_name="Book Name")
        author_email = Field(attribute="author_email", column_name="Author Email")
    
        class Meta:
            model = Book
            fields = ('id', 'name', 'author_email')
    

    Output:

    Book Name |Author Email     |id
    ----------|-----------------|--
    Foo       |email@example.com|1 
    

    With declared fields having different names:

    class BookResource(resources.ModelResource):
        # declared fields are different
        some_other_name = Field(attribute="name", column_name="Book Name")
        some_other_author_email = Field(attribute="author_email", column_name="Author Email")
    
        class Meta:
            model = Book
            fields = ('id', 'name', 'author_email')
    

    Output:

    Book Name |Author Email     |id|name      |author_email     
    ----------|-----------------|--|----------|-----------------
    Foo       |email@example.com|1 |Foo       |email@example.com
    

    The fields declaration is then used as a whitelist to determine which fields appear in the output. If you don't have a fields declaration, then all fields are shown.
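
The merge described above can be sketched in plain Python, without Django. This is only a simplified model written to reproduce the column lists reported in the question, not the library's actual code: export_headers and its parameters are hypothetical names, and the assumption that declared fields are always kept is taken from the outputs shown above.

```python
def export_headers(declared, model_fields, meta_fields=None):
    """Return export headers under a simplified model of the field merge.

    declared:     {name on the resource class: (attribute, column_name)}
    model_fields: field names on the model, in order
    meta_fields:  the Meta.fields tuple, or None when it is not set
    """
    merged = {}
    # Step 1: declared fields are keyed by their name on the resource class.
    for class_attr_name, (_attribute, column_name) in declared.items():
        merged[class_attr_name] = column_name
    # Step 2: model fields are keyed by their model field name. A model field
    # is skipped if Meta.fields excludes it or if the key is already taken.
    for field_name in model_fields:
        if meta_fields is not None and field_name not in meta_fields:
            continue
        if field_name in merged:
            continue
        merged[field_name] = field_name  # header defaults to the field name
    return list(merged.values())


ROLE_MODEL_FIELDS = ["id", "name", "default_role", "default_plan"]
META_FIELDS = ("name", "default_role", "default_plan")

# Declared names differ from the model field names (the question's resource).
mismatched = {
    "name": ("name", "Sales Role"),
    "role": ("default_role", "System Role"),
    "plan": ("default_plan", "System Plan"),
}
# Declared names match the model field names (the fixed resource).
matching = {
    "name": ("name", "Sales Role"),
    "default_role": ("default_role", "System Role"),
    "default_plan": ("default_plan", "System Plan"),
}

print(export_headers(mismatched, ROLE_MODEL_FIELDS))               # 6 columns
print(export_headers(mismatched, ROLE_MODEL_FIELDS, META_FIELDS))  # 5 columns
print(export_headers(matching, ROLE_MODEL_FIELDS, META_FIELDS))    # 3 columns
```

In this sketch the keys role and plan never collide with default_role and default_plan, so both entries survive the merge. Once the declared names match, the model fields are skipped and only the three renamed columns remain.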