Tags: python, django, memory, python-imaging-library, sorl-thumbnail

Django / Python / PIL / sorl-thumbnail generation in bulk - memory error


I'm trying to bulk generate 4 thumbnails for each of around 40k images with sorl-thumbnail for my Django app. I iterate through all Django objects that have an ImageWithThumbnailsFieldFile, and then call its generate_thumbnails() function.

This works fine, except that after a few hundred iterations I run out of memory and my loop crashes with 'memory error'. Since sorl-thumbnail uses PIL to generate thumbs, it seems that PIL doesn't return all of the memory it used when generating a thumb.

Does anybody know how to avoid this problem, e.g. by forcing PIL to return the memory it no longer needs?

My code simply looks like this:

all = Picture.objects.all()
for i in all:
    i.image.generate_thumbnails()

The generate_thumbnails() function starts here, line 129.

Thanks in advance for any advice!

Martin


Solution

  • Your problem relates to how Django caches the results of a queryset as you loop through it. Django keeps every object in memory so that the next time you iterate through the same queryset it doesn't have to hit the database again. With around 40k Picture objects, that cache keeps growing until memory runs out.

    What you need to do is use the iterator() method, which reads the results without filling the queryset's cache. So:

    all = Picture.objects.all().iterator()
    for i in all:
        i.image.generate_thumbnails()
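
As a minimal sketch of the same idea, the loop can be wrapped in a small helper that accepts any iterable of pictures. The helper name generate_all_thumbnails is hypothetical and not part of Django or sorl-thumbnail; Picture and generate_thumbnails() are taken from the question. Because the helper only walks the iterable once and keeps no reference to earlier objects, passing it Picture.objects.all().iterator() should let each object be freed once the loop moves on.

```python
def generate_all_thumbnails(pictures):
    """Call generate_thumbnails() on each picture; return how many were processed.

    `pictures` can be any iterable. In the Django case it is assumed to be
    Picture.objects.all().iterator(), so no queryset cache is built up.
    """
    processed = 0
    for picture in pictures:
        # Same call as in the question; nothing is stored between iterations.
        picture.image.generate_thumbnails()
        processed += 1
    return processed


# Assumed usage inside the Django project (Picture is the model from the question):
# generate_all_thumbnails(Picture.objects.all().iterator())
```

Returning the count is only a convenience for checking that all of the roughly 40k objects were visited; it does not change the memory behaviour.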