I am testing and preparing a new Django package for using bleach with Text and Char fields in the Django ORM and with DRF. I've hit a roadblock, however, and it has given me pause: I wonder whether I truly understand how a model's fields are instantiated. Hopefully someone can clear this up.
I am initialising the arguments for bleach by loading a default settings dict from django.conf.settings, and then checking a field_args parameter to see if any have been overridden for a specific field definition, like below. This is then used in the pre_save function to call bleach:
from bleach import clean
from django.conf import settings
from django.db.models import CharField


class BleachedCharField(CharField):
    """
    An enhanced CharField for sanitising input with the Python library, bleach.
    """

    def __init__(self, *args, field_args=None, **kwargs):
        """
        Initialize the BleachedCharField with default arguments, and update with called parameters.
        :param field_args: (dict) optional bleach argument overrides, format matches BLEACHFIELDS defaults.
        :param args: extra args to pass to CharField __init__
        :param kwargs: undefined args
        """
        super(BleachedCharField, self).__init__(*args, **kwargs)
        self.args = settings.BLEACHFIELDS or None
        if field_args:
            if 'tags' in field_args:
                self.args['tags'] = field_args['tags']
            if 'attributes' in field_args:
                self.args['attributes'] = field_args['attributes']
            if 'styles' in field_args:
                self.args['styles'] = field_args['styles']
            if 'protocols' in field_args:
                self.args['protocols'] = field_args['protocols']
            if 'strip' in field_args:
                self.args['strip'] = field_args['strip']
            if 'strip_comments' in field_args:
                self.args['strip_comments'] = field_args['strip_comments']

    def pre_save(self, model_instance, add):
        """
        Clean text, update model and return cleaned text.
        :param model_instance: (obj) model instance
        :param add: default textfield parameter, unused
        :return: clean text as unicode
        """
        bleached = clean(getattr(model_instance, self.attname), **self.args)
        setattr(model_instance, self.attname, bleached)
        return bleached
The problem I am having is that the self.args value for all fields on a model seems to be the value of the last field loaded on the model. So for example on this model:
class Writing(models.Model):
    """
    Stores a single writing of a specific Form ( relation :model:`writings.WritingForm` ) and
    Category ( relation :model:`writings.Category` ).
    """
    author = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.CASCADE,
        help_text=trans("Author")
    )
    title = BleachedCharField(
        max_length=200,
        help_text=trans("Title")
    )
    created = models.DateTimeField(
        auto_now_add=True,
        help_text=trans("First created.")
    )
    edited = models.DateTimeField(
        auto_now_add=True,
        help_text=trans("Last edited.")
    )
    description = BleachedTextField(
        blank=True,
        help_text=trans("A short description of the writing to entice potential readers.")
    )
    body = BleachedTextField(
        field_args=settings.PERMISSIVE_BLEACHFIELDS,
        help_text=trans("The body of the writing itself.")
    )
    writing_form = models.ForeignKey(
        WritingForm,
        on_delete=models.CASCADE,
        help_text=trans("Primary writing form.")
    )
    category = models.ForeignKey(
        Category,
        on_delete=models.CASCADE,
        help_text=trans("Writing form category")
    )
    slug = models.SlugField(
        editable=False,
        help_text=trans("URL and SEO friendly lower-cased string."),
        unique=True
    )
    comments = GenericRelation(settings.COMMENT_MODEL)
On this model the body field, which is the last bleached field on the model, overrides the self.args of all the BleachedCharField and BleachedTextField instances before it, so they all take the same parameters.
Am I missing something here? Is self.args not being added to the fields, but to the model instance instead? Is that why the last field's settings override all the other fields' settings? How should I be doing this to avoid the issue?
For added clarity, here are the BLEACHFIELDS default dict and the PERMISSIVE_BLEACHFIELDS dict:
BLEACHFIELDS = {
    'tags': [],
    'attributes': {},
    'styles': [],
    'protocols': [],
    'strip': True,
    'strip_comments': True
}
PERMISSIVE_BLEACHFIELDS = {
    'tags': ['b', 'em', 'i', 'strong', 'span'],
    'attributes': {'span': ['style']},
    'styles': ['text-decoration', 'font-weight'],
    'strip_comments': False
}
settings.BLEACHFIELDS is a single mutable dictionary, so every instance's self.args points to the same object. Mutating that object affects all instances, because this assignment binds a second name to the dictionary rather than copying it:
self.args = settings.BLEACHFIELDS or None
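The aliasing can be reproduced without Django. In this minimal sketch, DEFAULTS is a stand-in for settings.BLEACHFIELDS and Field is a hypothetical stand-in for the bleached field classes; neither name comes from the package itself.

```python
# Hypothetical stand-ins: DEFAULTS plays the role of settings.BLEACHFIELDS,
# Field plays the role of BleachedCharField / BleachedTextField.
DEFAULTS = {'tags': [], 'strip': True}


class Field:
    def __init__(self, field_args=None):
        # No copy is made: self.args is just another name for DEFAULTS.
        self.args = DEFAULTS
        if field_args and 'tags' in field_args:
            self.args['tags'] = field_args['tags']


title = Field()
body = Field(field_args={'tags': ['b', 'em']})

print(title.args is body.args)  # True: both attributes refer to one dict
print(title.args['tags'])       # ['b', 'em']: body's override leaked into title
```

The defaults themselves are changed too, which is why the last field declared with overrides appears to "win" for every field.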
One way to fix this is to use copy.deepcopy()
import copy # standard library module
self.args = copy.deepcopy(settings.BLEACHFIELDS or {})
Also, self.args can't be None. It must be a dictionary or later lines will raise errors.
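Putting both points together, here is a minimal Django-free sketch of the constructor logic. DEFAULTS and Field are again hypothetical stand-ins for settings.BLEACHFIELDS and the field class, and the use of dict.update() to apply the overrides is an assumption for brevity, not the original code.

```python
import copy

# Hypothetical stand-in for settings.BLEACHFIELDS.
DEFAULTS = {'tags': [], 'strip': True}


class Field:
    def __init__(self, field_args=None):
        # Every instance gets an independent copy of the defaults.
        # Falling back to {} keeps self.args a dict even if DEFAULTS is empty or None.
        self.args = copy.deepcopy(DEFAULTS or {})
        if field_args:
            # Overrides now touch only this instance's copy.
            self.args.update(field_args)


title = Field()
body = Field(field_args={'tags': ['b', 'em']})

print(title.args['tags'])  # []: title keeps the defaults
print(body.args['tags'])   # ['b', 'em']
print(DEFAULTS['tags'])    # []: the shared defaults are untouched
```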
Finally, if all you want is a shallow merge of two dictionaries, you can use the ** unpacking operator (Python 3.5+) and drop all those if blocks. Guard against field_args being None, because unpacking None raises a TypeError:
self.args = {**settings.BLEACHFIELDS, **(field_args or {})}
This creates a new top-level dictionary, but nested lists and dictionaries are still shared with settings.BLEACHFIELDS and with other instances, so don't mutate the nested data structures in place.
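The difference between replacing a top-level key and mutating a nested value can be seen in a short sketch. DEFAULTS and overrides are illustrative names standing in for settings.BLEACHFIELDS and a field_args value.

```python
# Illustrative stand-ins for settings.BLEACHFIELDS and a field_args dict.
DEFAULTS = {'tags': [], 'attributes': {}, 'strip': True}
overrides = {'strip': False}

merged_a = {**DEFAULTS, **overrides}
merged_b = {**DEFAULTS, **overrides}

# The merged dictionaries are distinct objects, so rebinding a top-level
# key in one does not affect the other.
merged_a['strip'] = True
print(merged_b['strip'])  # False

# The nested list, however, is the same object in all three dictionaries,
# so an in-place mutation is visible everywhere.
merged_a['tags'].append('b')
print(merged_b['tags'])   # ['b']
print(DEFAULTS['tags'])   # ['b']
```

If the nested values ever need to be modified per field, copy.deepcopy() on the merged result avoids the sharing.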