I have a table that looks something like this:

    class PodUsage(models.Model):
        pod_id = models.CharField(max_length=256, db_index=True)
        pod_name = models.CharField(max_length=256)
        start_time = models.DateTimeField(blank=True, null=True, default=timezone.now)
        end_time = models.DateTimeField(blank=True, null=True)
        anonymised = models.BooleanField(default=False)
        user = models.ForeignKey("accounts.ServiceUser", null=True, blank=True, on_delete=models.CASCADE)

As part of our GDPR requirement, we need to anonymize data after a certain period, which I could absolutely do as a loop:

    count = 0
    records = PodUsage.objects.filter(
        anonymised=False,
        start_time__lte=timezone.now() - timedelta(weeks=settings.DATA_ANONYMISING_PERIOD_WEEKS)
    )
    for record in records:
        record.pod_name = hashlib.sha256(record.pod_name.encode('utf-8')).hexdigest()
        record.user = None
        record.anonymised = True
        record.save()
        count += 1
    # Log count somewhere

However, I think I should be able to do it with an update function:

    count = PodUsage.objects.filter(
        anonymised=False,
        start_time__lte=timezone.now() - timedelta(weeks=settings.DATA_ANONYMISING_PERIOD_WEEKS)
    ).update(
        pod_name = hashlib.sha256(pod_name.encode('utf-8')).hexdigest(),
        user = None,
        anonymised = True
    )
    # Log count somewhere

... but I can't figure out the correct incantation to reference the field in the update portion:

- pod_name is not defined
- sha256("pod_name".encode('utf-8')) obviously just encodes the string "pod_name"
- sha256(F("pod_name").encode('utf-8')) breaks the code with 'F' object has no attribute 'encode'

Any suggestions?
You can use the SHA256 function [Django-doc] to let the database compute the hash:

    from django.db.models.functions import SHA256

    count = PodUsage.objects.filter(
        anonymised=False,
        start_time__lte=timezone.now()
        - timedelta(weeks=settings.DATA_ANONYMISING_PERIOD_WEEKS),
    ).update(pod_name=SHA256('pod_name'), user=None, anonymised=True)
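
To sanity-check the result, the value written by the database should match what the original loop produced with hashlib, assuming the database hashes the text as UTF-8. That value is a 64-character lowercase hex digest, so it fits in the max_length=256 column. Here is a minimal standard-library sketch of the expected value; the helper name expected_pod_name is hypothetical and only there for illustration:

```python
import hashlib


def expected_pod_name(pod_name: str) -> str:
    """Return the hex SHA-256 digest that the original loop stored for a pod name."""
    # Same expression as in the question's loop: UTF-8 encode, hash, hex-encode.
    return hashlib.sha256(pod_name.encode("utf-8")).hexdigest()


digest = expected_pod_name("abc")
print(digest)
print(len(digest))  # a SHA-256 hex digest is always 64 characters long
```

Comparing this against one row touched by the update is a cheap way to confirm that both approaches agree. Depending on the backend, the database function may need extra setup (on PostgreSQL, I believe the pgcrypto extension has to be installed), so it is worth checking the Django documentation for your version.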