Search code examples
djangopostgresqldjango-modelsdjango-orm

how to fetch data in chunks from db in django and then delete them?


My basic problem statement is that I want to fetch every row but in chunks only 2000 rows at one time from a table having 1 million rows. And after these chunked queries are evaluated i want to delete every row.

so just say i have

a = Model.objects.filter(id=1<2000)
b = Model.objects.filter(id=2000<4000)
c = Model.objects.filter(id=4000<6000)
..
..

now if i combine all these queryset into one queryset by some means say

del = a + b + c +d ......

and i do

del.delete()

so will it delete all the rows in less time or it will re process queries again to delete them and will take time?


Solution

  • Queryset's are lazy-evaluated so this will have no performance boost effect. But, ff you want to decrease memory consumption, then you can use (iterator)[https://docs.djangoproject.com/en/3.0/ref/models/querysets/#iterator] with chunk_size parameter set:

    for record in Model.objects.all().iterator(chunk_size=2000):
        record.delete()
    

    Else, if you are actually looking for improving deletion speed, then you can try to use undocumented method _raw_delete

    a = Model.objects.all()
    a._raw_delete(a.db)
    

    only if:

    1. your models don't have cascade deleting (like foreign keys related to this model with on_delete=CASCADE)
    2. your model have no signals to be handled