Compress - How small can a file be before it will create the .Z file?

I am just curious. Compress by itself will not compress a file of zero bytes. (I know gzip will.)

Without using compress -f, at what point will compress work? In other words, what is the smallest a file can be before compress will create the .Z file?

Some of us here are just wondering...

Thx

Sounds like you can test this one out on your own.

Heck, you can even write a script: start with a zero-byte file and try to compress it, then test the return value of the compress command. If it failed, add one byte to the file and try again, and so on, until you get a match.
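That loop might look something like this (a rough sketch, assuming compress(1) is installed; the filename "probe" and the padding byte 'a' are arbitrary choices):

```shell
#!/bin/sh
# Sketch of the suggested test: start from a zero-byte file and add one
# byte at a time until compress (without -f) agrees to create the .Z file.
dir=$(mktemp -d) && cd "$dir" || exit 1
size=0
: > probe                                 # start with a zero-byte file
if command -v compress >/dev/null 2>&1; then
    while [ "$size" -lt 10000 ]; do       # safety cap so we never loop forever
        if compress probe 2>/dev/null; then
            echo "compress first created probe.Z at $size byte(s)"
            break
        fi
        printf 'a' >> probe               # add one byte and try again
        size=$((size + 1))
    done
else
    echo "compress is not installed here, so nothing to measure"
fi
```

One caveat: a run of identical bytes is about as compressible as data gets, so the number this prints is a best case; with less repetitive content compress would hold out for much longer.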

Look forward to your test results!!

I think it has less to do with file size than with the "compressibility" of the data inside. I've seen fairly large files that couldn't be compressed. Or you could try taking a 100 MB file, compressing it down to, say, 70 MB with gzip, then running compress on that. I doubt compress could shrink it further without actually adding to the file size.

I'm not familiar with compression algorithms, though, so I might be just hot air...

I agree that the size doesn't matter :eek: :smiley: (ha!).

If you have a file with the digits of pi, I believe it doesn't have any repeating runs of characters or blanks, so it probably wouldn't be compressed at all. However, a file of six bytes with a space in the middle would be compressed.

I might be somewhat incorrect, but I think most of what a compression algorithm does comes down to repeated characters, patterns, and blanks, more than anything else.

:cool:
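For what it's worth, the "compressibility" point is easy to see with gzip (a quick sketch; the filenames are arbitrary and it assumes /dev/urandom exists): the same 100 KB compresses very differently depending on what's in it.

```shell
#!/bin/sh
# Compare gzip on 100 KB of random bytes vs 100 KB of one repeated letter.
dir=$(mktemp -d) && cd "$dir" || exit 1
head -c 100000 /dev/urandom > random.bin   # incompressible noise
yes a | head -c 100000 > repeat.txt        # highly repetitive text
gzip -c random.bin > random.bin.gz
gzip -c repeat.txt > repeat.txt.gz
wc -c random.bin.gz repeat.txt.gz          # repeat.txt.gz is tiny; random.bin.gz stays near 100 KB
```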

Actually, size does matter. And gzip will compress a file no matter what.

From the man page -

Gzip uses the Lempel-Ziv algorithm used in zip and PKZIP. The amount of compression obtained depends on the size of the input and the distribution of common substrings. Typically, text such as source code or English is reduced by 60-70%. Compression is generally much better than that achieved by LZW (as used in compress), Huffman coding (as used in pack), or adaptive Huffman coding (compact).

Compression is always performed, even if the compressed file is slightly larger than the original. The worst case expansion is a few bytes for the gzip file header, plus 5 bytes every 32K block, or an expansion ratio of 0.015% for large files. Note that the actual number of used disk blocks almost never increases. gzip preserves the mode, ownership and timestamps of files when compressing or decompressing.
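You can see that "always performed" behavior on the original zero-byte question (a quick sketch; the filename is arbitrary): gzip happily emits an archive that is nothing but the fixed header and trailer overhead.

```shell
#!/bin/sh
# gzip, unlike compress, will wrap even an empty file; the output is just
# the fixed header/trailer overhead the man page describes.
dir=$(mktemp -d) && cd "$dir" || exit 1
: > empty                       # zero-byte file
gzip -c < empty > empty.gz      # reading stdin avoids storing the filename
wc -c empty empty.gz            # empty is 0 bytes; empty.gz is about 20 bytes
```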