Search code examples
encodingcompressiontransformationrun-length-encodingburrows-wheeler-transform

Efficient to apply Run-Length Transform after Move to Front Transform and BWT?


I'm new to encoding so I'm trying to understand the basics. I came across a document that was describing a lossless text compression technique, and in this document was a figure that illustrated how their compression works. It works like so:

Source -> BWT -> MTF -> RLT -> Proprietary Entropy Encoder

I don't understand why they would use Run-Length Transform after Move to Front Transform, it doesn't seem efficient to me. As I understand it, MTF does not produce many runs itself and therefore it would not be useful to use RLT afterwords.

Some explanation would be much appreciated!


Solution

  • It's not efficient.

    It's probably an old paper on BWT. Back then, using RLE (run-length Encoder) was a common trick to improve speed. RLE, btw, was not only used after BWT, but also before, in order to speed up BWT stage. The same logic could apply to the entropy coder, but it is unlikely to provide as much benefit.

    Nowadays, this trick are completely out dated. Before BWT, you might find some kind of LZP pre-processing, especially for long range matches (beyond block size), with about the same intend as RLE regarding speed improvement, but more powerful. MTF is also completely replaced, since it is too CPU intensive and not as efficient for its cost.