Lbzip2 0.10 (Default branch)

Lbzip2 is a Pthreads-based parallel bzip2/bunzip2 filter that can be passed to GNU tar with the --use-compress-program option. Neither its input nor its output has to be a regular file. Successful splitting for decompression isn't guaranteed, only very likely, and failure is detected. In both modes the input is split at a block size of approximately 900k, and compression itself uses the same block size. On an Athlon-64 X2 6000+, lbzip2 was 92% faster than standard bzip2 when compressing, and 45% faster when decompressing (based on wall clock time). Lbzip2 strives to be portable: it requires only UNIX 98 APIs and an unmodified libbz2.

License: GNU General Public License (GPL)

Changes:
Testing on a 128-core HP SuperDome showed that a known bottleneck in the multiple-workers decompressor is significant on many-core machines: whenever there were fewer input blocks than cores, the work was distributed unevenly. Hence, the single splitter-to-workers queue of "scan and decompress" tasks was replaced with two queues: a low-priority, splitter-to-workers queue of "scan" tasks, and a high-priority, workers-to-workers queue of "decompress" tasks. Alas, this also increased the number of context switches. The new worker broadcast conditions were formally proven in the comments.
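
The two-queue scheme can be illustrated with a minimal sketch. This is not lbzip2's code (lbzip2 is written in C on top of Pthreads); the class TwoPriorityQueue and its method names are hypothetical, and Python's threading.Condition stands in for a Pthreads mutex and condition variable. It only shows the priority rule: a worker asking for work always receives a pending "decompress" task before any "scan" task.

```python
import threading
from collections import deque


class TwoPriorityQueue:
    """Hypothetical helper: two FIFO queues guarded by one condition variable."""

    def __init__(self):
        self._cond = threading.Condition()
        self._decompress = deque()  # high priority, filled by workers
        self._scan = deque()        # low priority, filled by the splitter
        self._closed = False

    def put_scan(self, task):
        # Called by the splitter for each chunk of input.
        with self._cond:
            self._scan.append(task)
            self._cond.notify()

    def put_decompress(self, task):
        # Called by a worker that found a block while scanning.
        with self._cond:
            self._decompress.append(task)
            self._cond.notify()

    def close(self):
        # Assumption: called once no producer will add further tasks.
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def get(self):
        """Return (kind, task), or None once closed and drained."""
        with self._cond:
            while True:
                if self._decompress:
                    return ("decompress", self._decompress.popleft())
                if self._scan:
                    return ("scan", self._scan.popleft())
                if self._closed:
                    return None
                self._cond.wait()
```

Each worker would loop on get(), scanning or decompressing depending on the returned kind. Because a scanning worker hands found blocks to the shared high-priority queue instead of decompressing them itself, idle workers can pick them up even when there are fewer input chunks than cores. The extra handoff is also where additional context switches would come from.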
