What's the fastest UNIX compression utility you know of?

Hi everyone,

Just interested to know everyone's opinions on the fastest UNIX compression utility with okay-ish compression (it doesn't have to have awesome compression). I know of gzip, bzip2 (sucks lol) and a couple of others, but what is a great one for compressing large amounts of data that won't eat too hard into the CPU and is fast?

All opinions are welcome.

Thanks.

Check out Compression Tools Compared

There are plenty of names in the graphs to test. If it's still too slow for you, think about getting faster CPUs ^^

Thank you :D. Also, is there a tool similar to gzip that doesn't waste the extra storage space when you compress? You know how when you gzip a file it keeps the original, starts creating the gzipped one, and only deletes the original when it's done. Is there something that doesn't waste that kind of space?

In my eyes, this "waste" of storage is there for your safety. What would you do if, for some reason like a full filesystem, the compression failed and you were left with a half-eaten source file and a half-baked compressed file? At least while the compression is running, you have to live with that waste, in my eyes.

May I ask what you need such a fast compression tool for? Maybe there is some other way to handle it if you tell people here what you want to achieve, or what kind of mechanism is around it.

It's actually not a waste of space.
Did you mean something like running the compression on the fly, using primary memory, and just keeping what is needed at the end?

That would be kind of unsafe, since you might lose the source.

The problem with gzip is that it is still strictly single-threaded, so it won't scale on a multiprocessor system. A workaround could be to write a script which tars the source into several distinct tar files (with some mechanism to balance the sizes of the different files) and gzips these archives simultaneously, as in the sketch below.
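A rough sketch of that idea, assuming the top-level subdirectories of the source are roughly equal in size (no real balancing is done here), and with SRC, DEST and JOBS as placeholders you would adjust:

#!/bin/sh
# Compress each top-level subdirectory of SRC into its own .tar.gz,
# running up to JOBS gzip pipelines at the same time.
SRC=/path/to/source      # placeholder: directory to archive
DEST=/path/to/dest       # placeholder: where the archives go
JOBS=4                   # placeholder: number of concurrent pipelines

i=0
for dir in "$SRC"/*/; do
    name=$(basename "$dir")
    tar -cf - -C "$SRC" "$name" | gzip -9 > "$DEST/$name.tar.gz" &
    i=$((i + 1))
    # crude throttle: wait for the current batch before starting more
    [ $((i % JOBS)) -eq 0 ] && wait
done
wait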

Regarding the "waste of space": it is possible to feed gzip directly via a pipeline:

tar -cf - <source> | gzip -9 > <destination>

should work.
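For completeness, the reverse direction also streams through a pipe, so no temporary copy of the archive is needed (the file name is just an example):

gzip -dc /path/to/archive.tar.gz | tar -xf -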

I hope this helps.

bakunin

The original poster wanted to minimize CPU impact, so maybe it's not a problem that gzip only uses one CPU. I've found parallel bzip2 (pbzip2) quite handy when I've got a big job to do and don't mind tying up more than one CPU. Its speed scales nearly linearly with the number of CPUs, and you can limit it to fewer than all of them if you wish.
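If pbzip2 is available on your system, a typical invocation looks roughly like this; -p limits the number of CPUs it may use, and the paths and file names are only placeholders:

tar -cf - /path/to/source | pbzip2 -c -p4 > backup.tar.bz2   # compress with up to 4 CPUs
pbzip2 -d -p4 backup.tar.bz2                                  # decompress (parallel only for pbzip2-made archives)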

Cheers,
Eric

gzip seems to be the way to go; I've looked at many sources and can't really beat it. The compression is needed to gzip nightly Oracle exports. There are hot backups as well, so for some of the databases it would be worth the risk of the compression failing. The best solution we've come up with is using ZFS disk compression, which saves us about 20 GB on a 50 GB file, and since disk is pretty cheap we're heading down that road :smiley:

ZFS compression is certainly fast, but it's not gzip (at least not by default, IIRC). Later versions have a gzip option, but the default is lzjb, which has almost zero overhead but, of course, doesn't compress as well.
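For reference, turning compression on per dataset looks something like this (the pool and dataset names here are made up); "on" maps to lzjb, while newer releases also accept gzip levels:

zfs set compression=on tank/exports        # "on" = lzjb
zfs set compression=gzip-9 tank/exports    # heavier gzip compression (newer ZFS only)
zfs get compression,compressratio tank/exports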

I, too, love transparent filesystem compression in ZFS. :smiley: