Search code examples
gziphuffman-codelz77

Could you explain how to convert from lz77 to huffman?


Could you explain how to convert from lz77 to huffman on the example in the below picture?

example


Solution

  • Easy:

    In the first step your output is essentially 3 numbers:

    1. prev index
    2. number of characters to repeat
    3. next character (be it ascii or unicode)

    The algorithm demands that you specify a sliding window up front. That means you know how big (1) and (2) can be at most. In other words, you know how many bits (1) and (2) will take up. Since (3) is essentially also a character from a fixed length alphabet, you also know the bit-length of (3)

    That means it's safe to simply concatenate them. So, the output of the first algorithm can be thought of as outputting a bit-sequence, where every item in the sequence has a fixed length.

    That's ideal for applying huffman.

    Of course the specifics are not mentioned, and you can choose from a lot of options.

    • normalized huffman table
    • 1 on left-branch vs 0 on left-branch
    • priorities when merging items of similar count
    • etc

    So I can not readily explain the exact output values you are showing. But I hope I can at least explain how to get from A to B.