Tags: algorithm, image-processing, png, zlib, deflate

How to implement a PNG decoder completely from scratch


I started to work on a PNG encoding/decoding library for learning purposes so I want to implement every part of it by hand.

I got pretty far with it, but now I'm a bit stuck. Here are the things I have successfully implemented already:

  • I can load a PNG binary and go through its bytes
  • I can read the signature and the IHDR chunk for metadata
  • I can read the IDAT chunks and concatenate the image data into a buffer
  • I can read and interpret the zlib headers from the above mentioned image data

And here is where I got stuck. I vaguely know the steps from here which are:

  • Extract the zlib compressed data according to its headers
  • Figure out the filtering methods used and "undo" them to get the raw data
  • If everything went correctly, now I have raw RGB data in the form of [<R of 1st pixel>, <G of 1st pixel>, <B of 1st pixel>, <R of 2nd pixel>, <G of 2nd pixel>, etc...]

My questions are:

  • Is there any easy-to-understand implementation (maybe with examples) or guide on the zlib extraction? I found the official specifications hard to understand.
  • Can there be multiple filtering methods used in the same file? How to figure these out? How to figure out the "borders" of these differently filtered parts?
  • Is my understanding of how the final data will look correct? What about the alpha channel, or when a palette is used?

Solution

    1. Yes. You can look at puff.c (in the contrib/puff directory of the zlib distribution), which is an inflate implementation written with the express purpose of being a guide to how to decode a deflate stream.

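    2. Yes. Each line of the image can use a different filter. The filter type (0 to 4: None, Sub, Up, Average, Paeth) is specified in the first byte of the decompressed line, so the "borders" are simply the line boundaries.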

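    3. Yes, for a truecolor image. In general, if you get it all right, you will have a sequence of pixels, where each pixel is one of the following, depending on the color type in IHDR: a grayscale value (G), grayscale with an alpha channel (GA), RGB (red, green, blue, in that order), RGBA, or, when a palette is used, an index into the PLTE chunk.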
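
To make point 1 more concrete, here is a minimal Python sketch of an inflate routine along the lines of what puff.c does. It is not a port of puff.c: the names `BitReader`, `build`, `decode`, `read_dynamic` and `inflate` are made up for this illustration, it does no validation of malformed input, and it is slow because it reads one bit at a time. It expects a raw deflate stream, that is, the zlib data with the 2-byte header and the 4-byte Adler-32 trailer removed.

```python
# Base values and extra-bit counts for length symbols 257..285 and distance symbols 0..29.
LENGTH_BASE = [3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
               35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258]
LENGTH_EXTRA = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2,
                3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 0]
DIST_BASE = [1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193, 257, 385,
             513, 769, 1025, 1537, 2049, 3073, 4097, 6145, 8193, 12289, 16385, 24577]
DIST_EXTRA = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7,
              8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13]
CL_ORDER = [16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15]

class BitReader:
    def __init__(self, data):
        self.data, self.pos = data, 0          # pos is a bit offset
    def bits(self, n):
        value = 0                              # plain fields are packed LSB first
        for i in range(n):
            bit = (self.data[self.pos >> 3] >> (self.pos & 7)) & 1
            value |= bit << i
            self.pos += 1
        return value

def build(lengths):
    """Canonical Huffman table: (code length, code) -> symbol."""
    count = [0] * 16
    for n in lengths:
        count[n] += 1
    count[0] = 0
    code, next_code = 0, [0] * 16
    for n in range(1, 16):
        code = (code + count[n - 1]) << 1
        next_code[n] = code
    table = {}
    for symbol, n in enumerate(lengths):
        if n:
            table[(n, next_code[n])] = symbol
            next_code[n] += 1
    return table

def decode(reader, table):
    code = 0                                   # Huffman codes arrive MSB first
    for n in range(1, 16):
        code = (code << 1) | reader.bits(1)
        if (n, code) in table:
            return table[(n, code)]
    raise ValueError("invalid Huffman code")

FIXED = (build([8] * 144 + [9] * 112 + [7] * 24 + [8] * 8), build([5] * 30))

def read_dynamic(reader):
    hlit, hdist, hclen = reader.bits(5) + 257, reader.bits(5) + 1, reader.bits(4) + 4
    cl = [0] * 19
    for i in range(hclen):
        cl[CL_ORDER[i]] = reader.bits(3)
    cl_table, lengths = build(cl), []
    while len(lengths) < hlit + hdist:         # code lengths are run-length coded
        symbol = decode(reader, cl_table)
        if symbol < 16:
            lengths.append(symbol)
        elif symbol == 16:
            lengths += [lengths[-1]] * (3 + reader.bits(2))
        elif symbol == 17:
            lengths += [0] * (3 + reader.bits(3))
        else:
            lengths += [0] * (11 + reader.bits(7))
    return build(lengths[:hlit]), build(lengths[hlit:hlit + hdist])

def inflate(data):
    reader, out, final = BitReader(data), bytearray(), 0
    while not final:
        final, block_type = reader.bits(1), reader.bits(2)
        if block_type == 0:                    # stored: skip to byte boundary, copy
            reader.pos = (reader.pos + 7) & ~7
            size = reader.bits(16)
            reader.bits(16)                    # one's complement of size, not checked here
            start = reader.pos >> 3
            out += data[start:start + size]
            reader.pos += 8 * size
        elif block_type in (1, 2):             # fixed or dynamic Huffman codes
            literals, distances = FIXED if block_type == 1 else read_dynamic(reader)
            while True:
                symbol = decode(reader, literals)
                if symbol < 256:
                    out.append(symbol)         # literal byte
                elif symbol == 256:
                    break                      # end of block
                else:
                    symbol -= 257
                    length = LENGTH_BASE[symbol] + reader.bits(LENGTH_EXTRA[symbol])
                    d = decode(reader, distances)
                    distance = DIST_BASE[d] + reader.bits(DIST_EXTRA[d])
                    for _ in range(length):    # byte by byte, the copy may overlap itself
                        out.append(out[-distance])
        else:
            raise ValueError("reserved block type")
    return bytes(out)
```

While developing your own version, it helps to compare its output against a known-good decompressor on the same input, and only then drop that dependency.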
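
For point 2, undoing the filters could look roughly like the sketch below. It assumes a non-interlaced image; `unfilter`, `paeth`, `stride` (bytes per line, not counting the filter byte) and `bpp` (bytes per complete pixel, rounded up to at least 1) are names chosen for this example.

```python
def paeth(a, b, c):
    """Paeth predictor: pick whichever of left (a), up (b), upper-left (c) is closest to a + b - c."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c

def unfilter(data, height, stride, bpp):
    """Reverse the per-line filters of decompressed, non-interlaced PNG data."""
    out = bytearray()
    prev = bytearray(stride)                 # the line above the first line is all zeros
    pos = 0
    for _ in range(height):
        filter_type = data[pos]              # first byte of every line selects the filter
        line = bytearray(data[pos + 1:pos + 1 + stride])
        pos += 1 + stride
        for i in range(stride):
            a = line[i - bpp] if i >= bpp else 0    # left (already reconstructed)
            b = prev[i]                             # up
            c = prev[i - bpp] if i >= bpp else 0    # upper left
            if filter_type == 0:
                predictor = 0                # None
            elif filter_type == 1:
                predictor = a                # Sub
            elif filter_type == 2:
                predictor = b                # Up
            elif filter_type == 3:
                predictor = (a + b) // 2     # Average
            elif filter_type == 4:
                predictor = paeth(a, b, c)   # Paeth
            else:
                raise ValueError("unknown filter type %d" % filter_type)
            line[i] = (line[i] + predictor) & 0xFF
        out += line
        prev = line
    return bytes(out)
```

Note that the filters operate on bytes, not on pixels or samples, which is why only `bpp` and `stride` are needed here and not the color type itself.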
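
For point 3, the layout of each line follows from the color type and bit depth in IHDR. A small sketch, with `row_layout` and `expand_palette` as hypothetical helper names, and with the palette expansion handling 8-bit indices only (lower bit depths pack several indices into one byte and would need unpacking first):

```python
# Samples per pixel for each PNG color type:
# 0 = grayscale (G), 2 = truecolor (RGB), 3 = palette index, 4 = GA, 6 = RGBA
CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}

def row_layout(width, color_type, bit_depth):
    """Return (stride, bpp): bytes per line without the filter byte, and bytes per pixel."""
    bits_per_pixel = CHANNELS[color_type] * bit_depth
    stride = (width * bits_per_pixel + 7) // 8   # lines are padded to a whole byte
    bpp = max(1, bits_per_pixel // 8)            # filters use at least 1 byte per pixel
    return stride, bpp

def expand_palette(indices, plte):
    """Turn 8-bit palette indices into RGB bytes using the PLTE chunk data."""
    out = bytearray()
    for index in indices:
        out += plte[3 * index:3 * index + 3]     # each PLTE entry is 3 bytes: R, G, B
    return bytes(out)
```

For example, an 8-bit RGB image that is 3 pixels wide has `row_layout(3, 2, 8) == (9, 3)`, so every decompressed line is 1 filter byte followed by 9 bytes of pixel data. With a bit depth of 16, each sample takes two bytes, most significant byte first.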