Recently I started writing a PNG decoder for fun. To understand the format and the compression, I read the following:
PNG (Portable Network Graphics) Specification, Version 1.2
RFC 1950 ("ZLIB Compressed Data Format Specification")
RFC 1951 ("DEFLATE Compressed Data Format Specification")
From these I now have a basic understanding of the format and the compression.
As a first step I decided to implement a simple decoder for a simple PNG file, so I chose a PNG image containing a single IDAT chunk, which in turn holds a single zlib stream with a single DEFLATE block.
The image file used is this one: [image: Image file used for decoding]
I extracted the zlib stream from the image file; in a hex editor it looks like this: [image: Hex view of IDAT chunk]
The binary representation of the part marked in red is this: [image: Binary representation of zlib block]
From what I have understood from the specs, I decoded it as follows (a bit-reading sketch in C follows the list below): [image: Decoded binary representation of zlib block]
BFINAL=FINAL BLOCK
BTYPE=DYNAMIC HUFFMAN
HLIT=29
HDIST=29
HCLEN=11
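For concreteness, here is a minimal C sketch of how those header fields can be read with an LSB-first bit reader; the names (`bitreader`, `get_bits`, `read_block_header`) are my own illustrative choices, not anything from the spec, but the field order and widths are from RFC 1951, section 3.2.7.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative LSB-first bit reader: DEFLATE fills each byte starting
   at its least significant bit (RFC 1951, section 3.1.1).            */
typedef struct {
    const uint8_t *data;  /* compressed data after the 2-byte zlib header */
    size_t pos;           /* current byte index                           */
    int bit;              /* next bit position (0..7) within data[pos]    */
} bitreader;

static unsigned get_bits(bitreader *br, int n)
{
    unsigned val = 0;
    for (int i = 0; i < n; i++) {
        val |= ((unsigned)(br->data[br->pos] >> br->bit) & 1u) << i;
        if (++br->bit == 8) { br->bit = 0; br->pos++; }
    }
    return val;           /* the first bit read ends up as the lowest bit */
}

/* Dynamic-block header fields, read in the order given in RFC 1951, 3.2.7. */
static void read_block_header(bitreader *br,
                              unsigned *hlit, unsigned *hdist, unsigned *hclen)
{
    unsigned bfinal = get_bits(br, 1);  /* 1  -> final block                     */
    unsigned btype  = get_bits(br, 2);  /* 2  -> dynamic Huffman codes           */
    *hlit  = get_bits(br, 5);           /* 29 -> 257 + 29 literal/length codes   */
    *hdist = get_bits(br, 5);           /* 29 -> 1 + 29 distance codes           */
    *hclen = get_bits(br, 4);           /* 11 -> 4 + 11 code length code lengths */
    (void)bfinal; (void)btype;
}
```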
The part marked in green holds the (HCLEN + 4) code lengths for the code length alphabet.
The lengths read are: 6, 6, 0, 2, 3, 3, 5, 4, 4, 4, 3, 6, 4, 6, 5
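Continuing the sketch above (same hypothetical `bitreader`/`get_bits` helpers), these (HCLEN + 4) three-bit values are read in the fixed symbol order that RFC 1951, section 3.2.7 defines:

```c
/* RFC 1951, 3.2.7: the (HCLEN + 4) code lengths are stored as 3-bit
   values in this fixed symbol order; symbols that are not sent keep
   code length 0.                                                     */
static const int cl_order[19] = {
    16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15
};

static void read_cl_lengths(bitreader *br, unsigned hclen, int cl_lens[19])
{
    for (int i = 0; i < 19; i++)
        cl_lens[i] = 0;
    for (unsigned i = 0; i < hclen + 4; i++)      /* here: 15 lengths */
        cl_lens[cl_order[i]] = get_bits(br, 3);
}
```

So the fifteen values above belong to symbols 16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13 in that order; symbols 2, 14, 1 and 15 are never sent and stay at length 0, as does 18, whose transmitted length is 0.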
The Huffman codes generated for the above code lengths are as follows: [image: Generated Huffman codes]
After assigning them to the corresponding code length alphabet symbols, the result is this (note: symbol 18 is not used, since its code length is zero): [image: Assigned Huffman codes]
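For reference, this is the canonical construction described in RFC 1951, section 3.2.2, sketched in C (again with illustrative names): count the codes of each length, compute the smallest code of each length, then hand out consecutive code values to symbols of equal length in increasing symbol order.

```c
#define MAXBITS 7   /* lengths for the code length alphabet are 3-bit values,
                       so no code here can be longer than 7 bits              */

/* Canonical Huffman codes from code lengths (RFC 1951, 3.2.2).
   codes[s] is only meaningful where lens[s] != 0.               */
static void build_codes(const int lens[], int nsyms, unsigned codes[])
{
    int bl_count[MAXBITS + 1] = {0};
    unsigned next_code[MAXBITS + 1] = {0};
    unsigned code = 0;

    for (int s = 0; s < nsyms; s++)        /* count codes of each length      */
        bl_count[lens[s]]++;
    bl_count[0] = 0;                       /* length 0 means "no code"        */

    for (int bits = 1; bits <= MAXBITS; bits++) {
        code = (code + bl_count[bits - 1]) << 1;   /* smallest code per length */
        next_code[bits] = code;
    }

    for (int s = 0; s < nsyms; s++)        /* assign in increasing symbol order */
        if (lens[s] != 0)
            codes[s] = next_code[lens[s]]++;
}
```

A length of zero simply means the symbol gets no code at all, which is why symbol 18 drops out of the table.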
Now, when I started decoding the (HLIT + 257) code lengths for the literal/length alphabet using the assigned Huffman codes, the first decoded symbol came out as 16. That cannot be right, because symbol 16 means "copy the previous code length", and there is no previous code length at this point.
So there must be an error in my understanding of the format, but I cannot figure out where, and this is what I need help with.
The way you're representing the codes doesn't make sense, since you are showing a bunch of zeros that aren't part of the codes, so you can't tell what their lengths are. Also you are showing the codes in reverse, as compared to how they show up in the bytes.
More importantly, somehow you got the assignments wrong. Here are the correct codes, showing only the bits in the codes, in the correct bit order, with the correct assignments:
00 - 0
010 - 7
110 - 8
001 - 11
0101 - 5
1101 - 6
0011 - 10
1011 - 12
00111 - 9
10111 - 13
001111 - 3
101111 - 4
011111 - 16 (followed by two bits, +3 is the repeat count)
111111 - 17 (followed by three bits, +3 is the zeros count)
The first code length code is 001111, which says that literal 0 has code length 3.
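To make the bit order concrete, here is a small sketch of my own (repeating a minimal LSB-first bit reader so it stands alone; the names are illustrative), decoding one symbol the slow way, in the spirit of puff.c: bits come out of each byte least-significant first, and each new bit goes in below the bits already read, so the first bit read is the most significant bit of the canonical code. That is exactly why the codes look reversed compared to a left-to-right binary dump of the bytes.

```c
#include <stdint.h>
#include <stddef.h>

#define MAXBITS 7

/* Same LSB-first reader as sketched in the question (RFC 1951, 3.1.1). */
typedef struct { const uint8_t *data; size_t pos; int bit; } bitreader;

static unsigned get_bits(bitreader *br, int n)
{
    unsigned val = 0;
    for (int i = 0; i < n; i++) {
        val |= ((unsigned)(br->data[br->pos] >> br->bit) & 1u) << i;
        if (++br->bit == 8) { br->bit = 0; br->pos++; }
    }
    return val;
}

/* Decode one symbol: lens[s]/codes[s] are the canonical code lengths and
   code values (e.g. symbol 3 -> length 6, canonical code 111100).  The
   first bit read becomes the most significant bit of the code matched.  */
static int decode_symbol(bitreader *br, const int lens[],
                         const unsigned codes[], int nsyms)
{
    unsigned code = 0;
    for (int len = 1; len <= MAXBITS; len++) {
        code = (code << 1) | get_bits(br, 1);   /* append new bit at the bottom */
        for (int s = 0; s < nsyms; s++)
            if (lens[s] == len && codes[s] == code)
                return s;
    }
    return -1;                                  /* no match: invalid data */
}
```

Reading this way, the first code consumes the bits 1,1,1,1,0,0 and matches symbol 3, which the table above shows reversed as 001111. Once the code length symbols are flowing: 0..15 are literal code lengths, 16 repeats the previous length 3 to 6 times (2 extra bits), 17 emits 3 to 10 zeros (3 extra bits), and 18 emits 11 to 138 zeros (7 extra bits), which is why 16 can never legally be the first symbol.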