Search code examples
gifcompressionlzw

Need help decompressing GIF raster data


I have a 10x10 gif that consists of 4 colors, white, red, blue, black. I have parsed the gif data below

4749 4638 3961                      <-- header

0a00 0a00 9100 00                   <-- lsd (pb 91 = 1001 0001) nColors = 4, bytes = 12

ffffff ff0000 0000ff 000000         <-- global color table
                                    #0  FF FF FF
                                    #1  FF 00 00
                                    #2  00 00 FF
                                    #3  00 00 00

21f9 04(00) (0000) (00)00           <-- graphics control extension 
                                    (00) = pb (000 reserved 000 disposal method 0 user input flag 0 transparent color flag) 
                                    (0000) = delay time                     
                                    (00) = transparent color index

2c 0000 0000 0a00 0a00 00           <-- image descriptor

02 16                               <-- (image data - 02 = lzw min code size, 0x16 size of image (bytes))
8c2d 9987 2a1c dc33 a002 75ec 95fa a8de 608c 0491 4c01 00   <-- image block
3b  

Okay so we have our image data above (labeled image block) and I'm trying to decompress this so that I get the original image back. My understanding is that you read bytes left to right and bits right to left starting with lzwmincodesize + 1 (2 + 1 bits = 3 bits at a time).

This is the decompression algorithm I'm following

Initialize code table
let CODE be the first code in the code stream
output {CODE} to index stream
<LOOP POINT>
let CODE be the next code in the code stream
is CODE in the code table?
Yes:
    output {CODE} to index stream
    let K be the first index in {CODE}
    add {CODE-1}+K to the code table
No:
    let K be the first index of {CODE-1}
    output {CODE-1}+K to index stream
    add {CODE-1}+K to code table
return to LOOP POINT

I'm stepping through the decompression algorithm and here's what I've come up with so far... (starting with first 3 byte codes)

Global Color Table
000 FF FF FF
001 FF 00 00
010 00 00 FF
011 00 00 00
100 CLEAR
101 End of Data

 3  2   1   6  5   4      8   7    
10|001|100  0|010|110|1  100|110|01

last        current     output      cindex      exists      dictionary      value
            100                     4                                       CLEAR
100         001         001         1           y                           RED
001         110         001 001     6           n           +001 001        RED RED
001 001     110         001 001     6           y           +001 001 001    RED RED
001 001     010         010         2           y           +001 001 010    BLUE
010         010         010         2           y           +010 010        BLUE
010         110         001 001     6           y           +010 001 001    RED RED
001 001     100                     4                                       CLEAR
100         111         111         7 ???? <--- (what do I do here)?

I should be getting 5 values of red, and then 5 values of blue for the first 10 pixels but as you can see it decodes 5 red then 2 blue then 2 red. Can anybody point out what I'm doing wrong here?

Thanks


Solution

  • Your mistake comes from missing the code size increase. Here are the codes, "nextcode" values and current code size:

    Code read from bitstream:    100, 001, 110, 110, 0010, 1001
    internal 'Next' code value:  110, 110, 110, 111, 1000, 1001
    current code size:             3,   3,   3,   3,    4,    4
    

    The missing logic from your decode loop is that you need to maintain a "nextcode" variable which tells you where to insert codes in your table and when to increase the code size. It starts at the value "clearcode + 2" and increases after each code is read from the bitstream (after the first non CC value). The logic basically looks like this:

    clear_dictionary:
    clearcode = 1<<codestart;
    codesize = codestart+1;
    nextcode = clearcode + 2;
    nextlimit = 1<<(codesize+1);
    oldcode = -1;
    mainloop:
    while (!done)
    {
    code = getCode();
    if (code == clearcode)
       goto clear_dictionary;
    if (oldcode == -1)
    {
       write_code_to_output(code);
    }
    else
    {
       <LZW logic>
       nextcode++;
       if (nextcode >= nextlimit)
       {
          nextlimit <<= 1;
          codesize++;
       }
    }
    oldcode = code;
    } // while !done