Solid compression
Encyclopedia
In computing
, solid compression refers to a method for data compression
of multiple files, wherein all the compressed files are concatenated and treated as a single data block. Such an archive is called a solid archive. It is used natively in the 7z
and RAR formats, as well as indirectly in tar
-based formats such as
is not solid because it stores separate compressed files (even though solid compression can be portably emulated).
The term is ostensibly because the data is compressed as a single solid block, rather than as individual files.
(storing multiple files and metadata in a single file). One can combine these in two natural ways:
The order matters (these operations do not commute
), and this latter is solid compression.
In Unix, compression and archiving are traditionally separate operations, which allows one to understand this distinction:
. It is also very efficient when archiving a large number of rather small files.
Later versions of 7-zip use a variable solid block size, so that only a limited amount of data must be processed in order to extract one file.
Parameters control the maximum solid block window size, the number of files in a block, and whether blocks are separated by file extension.
http://www.7-zip.org/history.txt
http://docs.bugaco.com/7zip/MANUAL/switches/method.htm
Additionally, if the archive becomes even slightly damaged, some of the data (sometimes even all data) after the damaged part can be unusable (depending on the compression and archiving format), whereas in a non-solid archive format, usually only one file is unusable and the subsequent files can usually still be extracted.
Computing
Computing is usually defined as the activity of using and improving computer hardware and software. It is the computer-specific part of information technology...
, solid compression refers to a method for data compression
Data compression
In computer science and information theory, data compression, source coding or bit-rate reduction is the process of encoding information using fewer bits than the original representation would use....
of multiple files, wherein all the compressed files are concatenated and treated as a single data block. Such an archive is called a solid archive. It is used natively in the 7z
7z
7z is a compressed archive file format that supports several different data compression, encryption and pre-processing algorithms. The 7z format initially appeared as implemented by the 7-Zip archiver. The 7-Zip program is publicly available under the terms of the GNU Lesser General Public...
and RAR formats, as well as indirectly in tar
Tar (file format)
In computing, tar is both a file format and the name of a program used to handle such files...
-based formats such as
.tar.gzGzipGzip is any of several software applications used for file compression and decompression. The term usually refers to the GNU Project's implementation, "gzip" standing for GNU zip. It is based on the DEFLATE algorithm, which is a combination of Lempel-Ziv and Huffman coding...
and .tar.bz2Bzip2bzip2 is a free and open source implementation of the Burrows–Wheeler algorithm. It is developed and maintained by Julian Seward. Seward made the first public release of bzip2, version 0.15, in July 1996.-Compression efficiency:...
.
By contrast, the ZIP formatZIP (file format)
Zip is a file format used for data compression and archiving. A zip file contains one or more files that have been compressed, to reduce file size, or stored as is...
is not solid because it stores separate compressed files (even though solid compression can be portably emulated).
The term is ostensibly because the data is compressed as a single solid block, rather than as individual files.
Explanation
Compressed file formats often feature both compression (storing the data in a small space) and archivingFile archiver
A file archiver is a computer program that combines a number of files together into one archive file, or a series of archive files, for easier transportation or storage...
(storing multiple files and metadata in a single file). One can combine these in two natural ways:
- compress the individual files, and then archive into a single file;
- archive into a single data block, and then compress.
The order matters (these operations do not commute
Commutativity
In mathematics an operation is commutative if changing the order of the operands does not change the end result. It is a fundamental property of many binary operations, and many mathematical proofs depend on it...
), and this latter is solid compression.
In Unix, compression and archiving are traditionally separate operations, which allows one to understand this distinction:
- compressing individual files and then archiving would be a
tar
ofgzip
'ed files – this is very uncommon, while - archiving via
tar
and then compressing yields a compressed archive: a.tar.gz
– and this is solid compression.
Benefits
Solid compression allows for much better compression rates when all the files are similar, which is often the case if they are of the same file formatFile format
A file format is a particular way that information is encoded for storage in a computer file.Since a disk drive, or indeed any computer storage, can store only bits, the computer must have some way of converting information to 0s and 1s and vice-versa. There are different kinds of formats for...
. It is also very efficient when archiving a large number of rather small files.
Costs
On the other hand, getting a single file out of a solid archive originally required processing all the files before it, so modifying solid archives could be slow and inconvenient.Later versions of 7-zip use a variable solid block size, so that only a limited amount of data must be processed in order to extract one file.
Parameters control the maximum solid block window size, the number of files in a block, and whether blocks are separated by file extension.
http://www.7-zip.org/history.txt
http://docs.bugaco.com/7zip/MANUAL/switches/method.htm
Additionally, if the archive becomes even slightly damaged, some of the data (sometimes even all data) after the damaged part can be unusable (depending on the compression and archiving format), whereas in a non-solid archive format, usually only one file is unusable and the subsequent files can usually still be extracted.