I am using Django 4.2 with Python 3.8. I have a MySQL 8 database containing the following models:
class A(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    name = models.CharField(max_length=522)

class B(models.Model):
    name = models.CharField(max_length=255, unique=True)
    a = models.ManyToManyField(A)
I want to create a function that, given an id for A, queries all instances of B related to that A. Because B can be very large (millions of rows), I want to iterate over it in batches of 1000 without loading everything into memory.
Is it possible to do so?
I wanted to use coroutines, since I think I can use them to build an iterator-like object.
Are there any other alternatives to make sure it is performant enough that I can parallelize it over 100+ organizations?
I would like code snippets if possible so I can understand this better.
Context: this is a management command that will run as a daily cronjob, so it needs to be fast.
You can use Django's __range lookup to break the queryset into chunks:
from django.db.models import Max

def get_b_in_chunks(a_id, chunk_size=1000):
    qs = B.objects.filter(a__id=a_id)
    # B uses an auto-incrementing integer pk, so we can chunk on pk windows.
    # (A's pk is a UUID, so ranging over it would not work.)
    max_pk = qs.aggregate(Max("pk"))["pk__max"] or 0
    for start in range(1, max_pk + 1, chunk_size):
        # __range is inclusive on both ends
        yield qs.filter(pk__range=(start, start + chunk_size - 1))
This yields the B rows linked to the given A in pk windows of 1000. Note that a window may hold fewer than 1000 rows if there are gaps in the pk sequence.
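As an alternative, Django querysets also have iterator(chunk_size=...), which streams rows from the database in fixed-size batches without materializing the whole result set, so you could pair it with a small batching helper. A sketch of such a helper in plain Python (in_batches is a name I'm introducing here, not a Django API; the Django usage in the comment assumes the models above):

```python
from itertools import islice

def in_batches(iterable, batch_size=1000):
    """Yield lists of up to batch_size items from any iterable,
    consuming it lazily so the full result set is never in memory."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

# With Django this could wrap a streaming queryset, e.g.:
#   rows = B.objects.filter(a__id=a_id).iterator(chunk_size=1000)
#   for batch in in_batches(rows, batch_size=1000):
#       process(batch)  # process is a hypothetical callback

# Plain-Python demonstration:
batches = list(in_batches(range(10), batch_size=4))
# batches == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Unlike pk-range windows, this guarantees every batch (except the last) has exactly batch_size rows, regardless of gaps in the pk sequence.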