I can't say whether a more efficient scheme exists, but Mark Nelson (1996) considers the following one reasonably efficient:
RLE input-file | BWT | MTF | RLE | ARI > output-file
A brief description of each of the programs follows:
RLE.CPP This program implements a simple run-length encoder. If the input file has many long runs of identical characters, the sorting procedure in the BWT can be degraded dramatically. The RLE front end prevents that from happening.
BWT.CPP The standard Burrows-Wheeler transform is done here. This program outputs repeated blocks consisting of a block size integer, a copy of L, the primary index, and a special last character index. This is repeated until BWT.EXE runs out of input data.
MTF.CPP The Move to Front encoder operates as described in the previous section.
RLE.CPP The fact that the output file is top-heavy with runs containing zeros means that applying another RLE pass to the output can improve overall compression. I believe that further processing of the MTF output will provide fertile ground for additional improvements.
ARI.CPP This is an order-0 adaptive arithmetic encoder, directly derived from the code published by Witten and Cleary in their 1987 CACM article.
marknelson.us/1996/09/01/bwt
The article also comes with the C++ source code for these programs.
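To make the BWT, MTF and second RLE stages from the quote more concrete, here is a minimal Python sketch. It is not Nelson's code: the function names are made up for illustration, the BWT sorts whole rotations naively (fine for a demo, far too slow for real blocks), and the RLE simply emits (value, run length) pairs, which is an assumed format rather than the one used in RLE.CPP.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive BWT: sort all rotations, return the last column L and the primary index."""
    n = len(data)
    if n == 0:
        return b"", 0
    # Sort rotation start offsets by the rotation text itself (quadratic memory, demo only).
    order = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last = bytes(data[(i - 1) % n] for i in order)
    primary = order.index(0)  # row of the sorted matrix that holds the original input
    return last, primary


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Rebuild the input from L and the primary index."""
    n = len(last)
    if n == 0:
        return b""
    # A stable sort of positions in L by byte value links each row to its left rotation.
    t = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = primary
    for _ in range(n):
        row = t[row]
        out.append(last[row])
    return bytes(out)


def mtf_encode(data: bytes) -> list[int]:
    """Move-to-front: emit each byte's position in a table, then move it to the front."""
    table = list(range(256))
    out = []
    for b in data:
        i = table.index(b)
        out.append(i)
        table.insert(0, table.pop(i))
    return out


def mtf_decode(codes: list[int]) -> bytes:
    """Inverse of mtf_encode."""
    table = list(range(256))
    out = bytearray()
    for i in codes:
        b = table.pop(i)
        out.append(b)
        table.insert(0, b)
    return bytes(out)


def rle_pairs(symbols: list[int]) -> list[tuple[int, int]]:
    """Collapse runs into (value, run_length) pairs (illustrative format only)."""
    runs: list[list[int]] = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return [(s, n) for s, n in runs]


if __name__ == "__main__":
    last, primary = bwt(b"banana")
    codes = mtf_encode(last)
    print(last, primary, codes, rle_pairs(codes))
    assert inverse_bwt(mtf_decode(codes), primary) == b"banana"
```

On an input as short as "banana" the effect is tiny, but it already shows the idea: the BWT groups equal bytes together, MTF turns those groups into zeros, and the second RLE pass collapses the zero runs before the arithmetic coder sees them.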
There are also BWT and MTF sources in JS:
https://gist.github.com/SKAhack/14b2dfc4208349f00799