Tags: python, image, image-processing

Get Image size WITHOUT loading image into memory


I understand that you can get the image size using PIL as follows:

from PIL import Image
im = Image.open(image_filename)
width, height = im.size

However, I would like to get the image width and height without having to load the image into memory. Is that possible? I am only gathering statistics on image sizes and don't care about the image contents. I just want to make my processing faster.


Solution

  • As the comments suggest, PIL does not load the image into memory when you call .open. In the PIL 1.1.7 source, the docstring for .open says:

    def open(fp, mode="r"):
        "Open an image file, without loading the raster data"
    

    There are a few file operations in the source like:

     ...
     prefix = fp.read(16)
     ...
     fp.seek(0)
     ...
    

    but these hardly constitute reading the whole file. In fact, on success .open simply returns an image object that holds the open file object and the filename. In addition, the docs say:

    open(file, mode="r")

    Opens and identifies the given image file.

    This is a lazy operation; this function identifies the file, but the actual image data is not read from the file until you try to process the data (or call the load method).
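To illustrate why identifying a file needs only a small read, here is a minimal standard-library sketch. It is not PIL's code: `png_size` and `PNG_SIGNATURE` are hypothetical names chosen for this example, and PNG is used only because its layout puts the width and height within the first 24 bytes (the 8-byte signature followed by the IHDR chunk).

```python
import io
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG file signature


def png_size(fp):
    """Return (width, height) of a PNG stream, reading only its first 24 bytes."""
    header = fp.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR data starts at byte 16: width, then height, as big-endian unsigned 32-bit ints.
    width, height = struct.unpack(">II", header[16:24])
    return width, height


# A fake PNG for demonstration: a valid header followed by a megabyte that is never read.
fake_png = io.BytesIO(
    PNG_SIGNATURE
    + struct.pack(">I", 13)          # IHDR chunk length
    + b"IHDR"
    + struct.pack(">II", 640, 480)   # width, height
    + b"\x00" * 1_000_000
)
size = png_size(fake_png)
bytes_read = fake_png.tell()
print(size, bytes_read)
```

The stream position after the call shows how little of the file was touched, which is the same idea behind the lazy .open described in the docs.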

    Digging deeper, we see that .open calls _open, which is an image-format-specific override. Each implementation of _open lives in its own file, e.g. the one for .jpeg files is in JpegImagePlugin.py. Let's look at that one in depth.

    Here things get a bit tricky: the method contains an infinite loop that is exited only when the JPEG start-of-scan marker is found:

        while True:
    
            s = s + self.fp.read(1)
            i = i16(s)
    
            if i in MARKER:
                name, description, handler = MARKER[i]
                # print hex(i), name, description
                if handler is not None:
                    handler(self, i)
                if i == 0xFFDA: # start of scan
                    rawmode = self.mode
                    if self.mode == "CMYK":
                        rawmode = "CMYK;I" # assume adobe conventions
                    self.tile = [("jpeg", (0,0) + self.size, 0, (rawmode, ""))]
                    # self.__offset = self.fp.tell()
                    break
                s = self.fp.read(1)
            elif i == 0 or i == 65535:
                # padded marker or junk; move on
                s = "\xff"
            else:
                raise SyntaxError("no marker found")
    

    This looks like it could read the whole file if the file were malformed. If it reads the header markers correctly, however, it should break out early, at the start of scan. The handler function ultimately sets self.size, which holds the dimensions of the image.
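The marker walk described above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not PIL's implementation: `jpeg_size` and `SOF_MARKERS` are hypothetical names, the stream must be seekable, and every marker before the frame header is assumed to be followed by a 2-byte length (true for the usual APPn, DQT and DHT segments).

```python
import io
import struct

# Start-of-frame markers, which carry the image dimensions.
# 0xC4, 0xC8 and 0xCC fall in the same range but are not SOF markers.
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}


def jpeg_size(fp):
    """Return (width, height) by walking JPEG segments up to the first SOF marker."""
    if fp.read(2) != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG file")
    while True:
        marker = fp.read(2)
        if len(marker) < 2 or marker[0] != 0xFF:
            raise ValueError("no marker found")
        code = marker[1]
        if code == 0xFF:
            # Padding byte: step back one byte so the next read re-aligns on the marker.
            fp.seek(-1, io.SEEK_CUR)
            continue
        if code == 0xDA:
            raise ValueError("start of scan reached before a frame header")
        raw_length = fp.read(2)
        if len(raw_length) < 2:
            raise ValueError("truncated segment")
        (length,) = struct.unpack(">H", raw_length)  # length includes these 2 bytes
        if code in SOF_MARKERS:
            # Frame header: sample precision (1 byte), height (2), width (2).
            _precision, height, width = struct.unpack(">BHH", fp.read(5))
            return width, height
        fp.seek(length - 2, io.SEEK_CUR)  # skip the payload without reading it


# A hand-built header for demonstration, followed by a megabyte that is never read.
app0 = b"\xff\xe0" + struct.pack(">H", 16) + b"JFIF\x00" + b"\x00" * 9
sof0 = b"\xff\xc0" + struct.pack(">HBHHB", 17, 8, 480, 640, 3) + b"\x00" * 9
fake_jpeg = io.BytesIO(b"\xff\xd8" + app0 + sof0 + b"\x00" * 1_000_000)
jpeg_dims = jpeg_size(fake_jpeg)
jpeg_bytes_consumed = fake_jpeg.tell()
print(jpeg_dims, jpeg_bytes_consumed)
```

Unlike the loop quoted above, this sketch stops at the frame header rather than at the start of scan, since the dimensions are already known at that point. For real files, calling Image.open and reading .size remains the simpler route.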