I have an LZW compressor/decompressor written in C.
The initial table consists of ASCII characters and then each now string to be saved into the table consists of a prefix and a character both saved in a list as int.
My compression works but my decompression leaves some characters out.
The input:
<title>Agile</title><body><h1>Agile</h1></body></html>
The output I get (notice the missing 'e' and '<'):
<title>Agile</title><body><h1>Agil</h1></body>/html>
This is the code I use (the relevant part):
void expand(int * input, int inputSize) {
// int prevcode, currcode
int previousCode; int currentCode;
int nextCode = 256; // start with the same dictionary of 255 characters
dictionaryInit();
// prevcode = read in a code
previousCode = input[0];
int pointer = 1;
// while (there is still data to read)
while (pointer < inputSize) {
// currcode = read in a code
currentCode = input[pointer++];
if (currentCode >= nextCode) printf("!"); // XXX not yet implemented!
currentCode = decode(currentCode);
// add a new code to the string table
dictionaryAdd(previousCode, currentCode, nextCode++);
// prevcode = currcode
previousCode = currentCode;
}
}
int decode(int code) {
int character; int temp;
if (code > 255) { // decode
character = dictionaryCharacter(code);
temp = decode(dictionaryPrefix(code)); // recursion
} else {
character = code; // ASCII
temp = code;
}
appendCharacter(character); // save to output
return temp;
}
Can you spot it? I'd be grateful.
Your decode function returns the first character in the string. You need this character in order to add it to the dictionary, but you should not set previousCode
to it. So your code should look like:
...
firstChar = decode(currentCode);
dictionaryAdd(previousCode, firstChar, nextCode++);
previousCode = currentCode;
...