File Compression methods

Rich rich at dranek.com
Thu Oct 10 10:48:09 MDT 2013


On Thu, Oct 10, 2013 at 01:55:03PM +0530, Dan Egli wrote:
>But I know that bzip2 is not the best compressor anymore. It's not
>too bad, but there are better ones. So I ask what you guys would recommend
>as the compression system? The only restriction I have on it is that it
>must be able to either handle the peculiarities of Unix vs. Dos/Windows
>systems (i.e. ownerships, permissions, device files, and symlinks, like
>tar) OR it must be able to compress from/decompress to stdin/stdout (like
>bzip2). 

xz usually compresses a little better than bzip2. The binary packages of 
Arch Linux (for example) are distributed as .tar.xz; I think they 
migrated to that from .tar.gz a few years ago. Of course, xz is also 
slower to compress than bzip2.
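
Since the requirement was that the compressor work from stdin to stdout, 
here is a rough sketch of creating a .tar.xz purely as a stream. The 
names "demo_src" and "demo.tar.xz" are made up for illustration, and it 
assumes tar and xz are both installed:

```shell
# Sketch only: "demo_src" and "demo.tar.xz" are hypothetical names.
set -e
work=$(mktemp -d)

# Build a tiny sample tree with a regular file and a symlink.
mkdir -p "$work/demo_src"
printf 'hello\n' > "$work/demo_src/a.txt"
ln -s a.txt "$work/demo_src/link"

# tar writes the archive to stdout; xz reads stdin and writes stdout.
# tar records ownership, permissions and symlinks; xz only sees bytes.
tar -C "$work" -cf - demo_src | xz -c > "$work/demo.tar.xz"

# List the contents without ever writing an uncompressed .tar to disk.
xz -dc "$work/demo.tar.xz" | tar -tvf -
```

The point of the split is that tar handles the Unix metadata and xz 
handles the compression, so either half can be swapped out.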

>And a two
>step process is unfortunately out of the question. The machines will only
>have either 750GB or 1TB hdds, which obviously won't work for extracting
>the tar to disk then extracting from the tar on disk. tar's extraction
>process would run out of space before it finished.

A "two-step process" for extracting compressed tar files (be they 
.tar.gz, .tar.bz2, .tar.xz, or anything else) is entirely unnecessary. tar 
can decompress and extract the tarball simultaneously (check the man 
page for the "-z", "-j", and "-J" options; with modern versions of tar 
you don't even have to specify these flags, since "tar xf data.tar.xz" 
will detect the compression and do it all for you). And if you didn't 
want to do it that way, you could set 
up the pipeline yourself with:

xz -d -c file.tar.xz | tar xf -

And if you want to get fancy and see the progress, use the "pv" (pipe 
viewer) command:

pv file.tar.xz | xz -d -c | tar xf -
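
If you want to convince yourself that the one-step form and the explicit 
pipeline give the same result, something like the following should do 
it. This is only a sketch: the directory names are invented for the 
test, and it assumes a tar that understands "-J" and detects 
compression when reading from a file (recent GNU tar and bsdtar both 
should):

```shell
# Sketch only: all paths under $work are hypothetical test names.
set -e
work=$(mktemp -d)
mkdir -p "$work/src/data" "$work/out1" "$work/out2"
printf 'some payload\n' > "$work/src/data/file.txt"
chmod 640 "$work/src/data/file.txt"

# Create the compressed tarball in one step (-J selects xz).
tar -C "$work/src" -cJf "$work/data.tar.xz" data

# Method 1: let tar detect the compression and extract directly.
tar -C "$work/out1" -xf "$work/data.tar.xz"

# Method 2: decompress to stdout and have tar read the stream from stdin.
xz -d -c "$work/data.tar.xz" | tar -C "$work/out2" -xf -

# Neither method wrote an intermediate .tar file; compare the results.
diff -r "$work/out1" "$work/out2"
```

Either way the uncompressed tar stream only ever exists in the pipe, so 
the disk just needs room for the compressed file plus the extracted 
data.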

-- 
Rich

