Data compression
In computer science and information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation would use.

Compression is useful because it helps reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth. On the downside, compressed data must be decompressed to be used, and this extra processing may be detrimental to some applications. For instance, a compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed as it is being decompressed (decompressing the video in full before watching it may be inconvenient, and requires storage space for the decompressed video). The design of data compression schemes therefore involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced (if using a lossy compression scheme), and the computational resources required to compress and decompress the data. Compression has been one of the main drivers of the growth of information over the past two decades.
Lossless versus lossy compression

Lossless compression algorithms usually exploit statistical redundancy in such a way as to represent the sender's data more concisely without error. Lossless compression is possible because most real-world data has statistical redundancy. For example, in English text, the letter 'e' is much more common than the letter 'z', and the probability that the letter 'q' will be followed by the letter 'z' is very small.
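A minimal sketch of how such redundancy is exploited: Huffman coding assigns short bit strings to frequent symbols and long ones to rare symbols. The sample string and the resulting savings are purely illustrative.

```python
import heapq
from collections import Counter

def huffman_code(text: str) -> dict[str, str]:
    """Assign short bit strings to frequent symbols, long ones to rare symbols."""
    counts = Counter(text)
    # Each heap entry: (frequency, tiebreaker, {symbol: code-so-far}).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Prefix the two cheapest subtrees' codes with 0 and 1, then merge them.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

text = "this is an example of a huffman tree"
code = huffman_code(text)
encoded = "".join(code[ch] for ch in text)
print(len(encoded), "bits vs", 8 * len(text), "bits uncompressed")
```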
Another kind of compression, called lossy data compression or perceptual coding, is possible if some loss of fidelity is acceptable. Generally, a lossy data compression scheme will be guided by research on how people perceive the data in question. For example, the human eye is more sensitive to subtle variations in luminance than it is to variations in color. JPEG image compression works in part by "rounding off" some of this less-important information. Lossy data compression provides a way to obtain the best fidelity for a given amount of compression.
Lossy image compression is used in digital cameras, to increase storage capacities with minimal degradation of picture quality. Similarly, DVDs use the lossy MPEG-2 video codec for video compression.
In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible) components of the signal. Compression of human speech is often performed with even more specialized techniques, so that "speech compression" or "voice coding" is sometimes distinguished as a separate discipline from "audio compression". Different audio and speech compression standards are listed under audio codecs. Voice compression is used in Internet telephony, for example, while audio compression is used for CD ripping and is decoded by audio players.
Lossless

The Lempel–Ziv (LZ) compression methods are among the most popular algorithms for lossless storage. DEFLATE is a variation on LZ which is optimized for decompression speed and compression ratio, but compression can be slow. DEFLATE is used in PKZIP, gzip, and PNG. LZW (Lempel–Ziv–Welch) is used in GIF images. Also noteworthy are the LZR (LZ–Renau) methods, which serve as the basis of the Zip method. LZ methods use a table-based compression model in which table entries are substituted for repeated strings of data. For most LZ methods, this table is generated dynamically from earlier data in the input. The table itself is often Huffman encoded (e.g. SHRI, LZX).
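The table-building idea is easiest to see in LZW itself. The following sketch shows the classic scheme: the string table starts with all single bytes and grows from the input, so repeated substrings collapse to single codes.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Classic LZW: grow a string table from the input itself and emit
    table indices in place of repeated byte strings."""
    table = {bytes([i]): i for i in range(256)}   # start with all single bytes
    w, out = b"", []
    for b in data:
        wc = w + bytes([b])
        if wc in table:
            w = wc                     # keep extending the current match
        else:
            out.append(table[w])       # emit the code for the longest match
            table[wc] = len(table)     # add the new string to the table
            w = bytes([b])
    if w:
        out.append(table[w])
    return out

codes = lzw_compress(b"TOBEORNOTTOBEORTOBEORNOT")
print(codes)   # repeated substrings collapse to single codes
```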
A current LZ-based coding scheme that performs well is LZX, used in Microsoft's CAB format.
The very best modern lossless compressors use probabilistic models, such as prediction by partial matching. The Burrows–Wheeler transform can also be viewed as an indirect form of statistical modelling.
In a further refinement of these techniques, statistical predictions can be coupled to an algorithm called arithmetic coding. Arithmetic coding, invented by Jorma Rissanen and turned into a practical method by Witten, Neal, and Cleary, achieves superior compression to the better-known Huffman algorithm and lends itself especially well to adaptive data compression tasks where the predictions are strongly context-dependent. Arithmetic coding is used in the bilevel image-compression standard JBIG and the document-compression standard DjVu. The text entry system Dasher is an inverse arithmetic coder.
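A toy sketch of the core idea: each symbol narrows an interval of [0, 1) in proportion to its probability, and any number inside the final interval identifies the whole message. The model probabilities below are made up, and real coders (including the Witten, Neal, and Cleary method) use integer arithmetic with renormalization rather than floats.

```python
# Fixed order-0 model: each symbol owns a slice of [0, 1) sized by its probability.
MODEL = {"a": (0.0, 0.6), "b": (0.6, 0.9), "!": (0.9, 1.0)}

def arith_encode(message: str) -> float:
    low, high = 0.0, 1.0
    for sym in message:
        span = high - low
        c_lo, c_hi = MODEL[sym]
        low, high = low + span * c_lo, low + span * c_hi
    return (low + high) / 2        # any number in [low, high) identifies the message

def arith_decode(value: float, length: int) -> str:
    out, low, high = [], 0.0, 1.0
    for _ in range(length):
        target = (value - low) / (high - low)
        for sym, (c_lo, c_hi) in MODEL.items():
            if c_lo <= target < c_hi:
                span = high - low
                low, high = low + span * c_lo, low + span * c_hi
                out.append(sym)
                break
    return "".join(out)

msg = "aaba!"
assert arith_decode(arith_encode(msg), len(msg)) == msg
```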
Theory

The theoretical background of compression is provided by information theory (which is closely related to algorithmic information theory) for lossless compression, and by rate–distortion theory for lossy compression. These fields of study were essentially created by Claude Shannon, who published fundamental papers on the topic in the late 1940s and early 1950s. Coding theory is also related. The idea of data compression is deeply connected with statistical inference.
Machine learning

There is a close connection between machine learning and compression: a system that predicts the posterior probabilities of a sequence given its entire history can be used for optimal data compression (by using arithmetic coding on the output distribution), while an optimal compressor can be used for prediction (by finding the symbol that compresses best, given the previous history). This equivalence has been used as justification for data compression as a benchmark for "general intelligence".
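The prediction-to-compression direction can be made concrete: each symbol costs -log2 p(symbol | history) bits, the length an ideal arithmetic coder driven by the predictor would approach. The adaptive order-0 model with Laplace smoothing below is an illustrative choice, not a statement about any particular compressor.

```python
import math
from collections import Counter

def ideal_code_length_bits(sequence: str) -> float:
    """Total -log2 p(symbol | history) under an adaptive order-0 model:
    the size an arithmetic coder driven by this predictor would approach."""
    counts, alphabet = Counter(), sorted(set(sequence))
    bits = 0.0
    for sym in sequence:
        # Laplace-smoothed probability of the next symbol given the history.
        p = (counts[sym] + 1) / (sum(counts.values()) + len(alphabet))
        bits += -math.log2(p)
        counts[sym] += 1
    return bits

print(ideal_code_length_bits("ababababab"))   # predictable -> few bits
print(ideal_code_length_bits("qzjxkvwmfy"))   # unpredictable -> many bits
```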
Data differencing

Data compression can be viewed as a special case of data differencing: data differencing consists of producing a difference given a source and a target, with patching producing a target given a source and a difference, while data compression consists of producing a compressed file given a target, and decompression consists of producing a target given only a compressed file. Thus, one can consider data compression as data differencing with empty source data, the compressed file corresponding to a "difference from nothing". This is the same as considering absolute entropy (corresponding to data compression) as a special case of relative entropy (corresponding to data differencing) with no initial data.
When one wishes to emphasize the connection, one may use the term differential compression to refer to data differencing.
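A minimal diff/patch sketch makes the "difference from nothing" point tangible. The copy/insert operation format here is an invented illustration (real delta formats such as VCDIFF are more compact), built on Python's standard difflib.

```python
import difflib

def make_delta(source: bytes, target: bytes) -> list:
    """Describe target as copy-from-source and literal-insert operations."""
    ops = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(a=source, b=target).get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))            # byte range reused from source
        elif j1 != j2:
            ops.append(("insert", target[j1:j2]))   # literal bytes carried in the delta
    return ops

def apply_patch(source: bytes, delta: list) -> bytes:
    out = bytearray()
    for op in delta:
        out += source[op[1]:op[2]] if op[0] == "copy" else op[1]
    return bytes(out)

old, new = b"the quick brown fox", b"the quick red fox"
assert apply_patch(old, make_delta(old, new)) == new
# With an empty source, the delta degenerates to one literal insert of the
# whole target: compression as a "difference from nothing".
assert apply_patch(b"", make_delta(b"", new)) == new
```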
Outlook and currently unused potential

It is estimated that the total amount of information stored on the world's storage devices could be further compressed by an average remaining factor of 4.5:1 with existing compression algorithms, which means that, thanks to compression, the world could store 4.5 times more information on its existing storage devices than it currently does (it is estimated that the combined technological capacity of the world to store information provided 1,300 exabytes of hardware digits in 2007, but when the corresponding content is optimally compressed, this represents only 295 exabytes of Shannon information).

Audio

Audio compression is designed to reduce the transmission bandwidth requirement of digital audio streams and the storage size of audio files. Audio compression algorithms are implemented in computer software as audio codecs. Generic data compression algorithms perform poorly with audio data, seldom reducing data size much below 87% of the original, and are not designed for use in real-time applications. Consequently, specifically optimized lossless and lossy audio algorithms have been created. Lossy algorithms provide greater compression rates and are used in mainstream consumer audio devices.
In both lossy and lossless compression, information redundancy is reduced, using methods such as coding, pattern recognition, and linear prediction to reduce the amount of information used to represent the uncompressed data.
For most practical audio applications, a slight reduction in audio quality is an acceptable trade-off for the savings in transmission or storage size, since users often cannot perceive the loss in playback quality. For example, one Compact Disc holds approximately one hour of uncompressed high-fidelity music, less than two hours of music compressed losslessly, or seven hours of music compressed in the MP3 format at medium bit rates.
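A rough back-of-the-envelope version of these figures, assuming a ~700 MB disc used as a data carrier, a 55% lossless compression ratio, and a ~200 kbit/s "medium" MP3 bitrate (all three numbers are assumptions for illustration; actual playing times vary):

```python
CD_BITS = 700 * 1024**2 * 8        # ~700 MB data capacity, in bits (assumed)
PCM_BITRATE = 44_100 * 16 * 2      # CD audio: 44.1 kHz, 16 bits, stereo -> 1,411,200 bit/s

def hours_of_audio(bitrate_bps: float) -> float:
    return CD_BITS / bitrate_bps / 3600

print(f"{hours_of_audio(PCM_BITRATE):.1f} h uncompressed")       # ~1.2 h
print(f"{hours_of_audio(PCM_BITRATE * 0.55):.1f} h lossless")    # ~2.1 h at a 55% ratio
print(f"{hours_of_audio(200_000):.1f} h MP3")                    # ~8.2 h at ~200 kbit/s
```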
Lossless audio compression

Lossless audio compression produces a representation of digital data that can be expanded to an exact digital duplicate of the original audio stream. This is in contrast to the irreversible changes upon playback from lossy compression techniques such as Vorbis and MP3. Compression ratios are similar to those for generic lossless data compression (around 50–60% of original size), and substantially less than for lossy compression, which typically yields 5–20% of original size.
The primary application areas of lossless encoding are:
- Archives: For archival purposes it is generally desired to preserve the source material exactly (i.e., at 'best possible quality').
- Editing: Audio engineers use lossless compression for audio editing to avoid digital generation loss.
- High-fidelity playback: Audiophiles prefer lossless compression formats to avoid compression artifacts.
- Creating master copies for mass-produced audio: High-quality losslessly compressed master copies of recordings are used to produce lossily compressed versions for digital audio players. As formats and encoders improve, updated lossily compressed files may be generated from the lossless master.
- As file storage and communications bandwidth have become less expensive and more available, lossless audio compression has become more popular.
Formats

Shorten was an early lossless format; newer ones include Free Lossless Audio Codec (FLAC), Apple's Apple Lossless, MPEG-4 ALS, Windows Media Audio 9 Lossless (WMA Lossless), Monkey's Audio, and TTA. See list of lossless codecs for a complete list.
Some audio formats feature a combination of a lossy format and a lossless correction; this allows stripping the correction to easily obtain a lossy file. Such formats include MPEG-4 SLS (Scalable to Lossless), WavPack, and OptimFROG DualStream.
Some formats are associated with a technology, such as:
- Direct Stream Transfer, used in Super Audio CD
- Meridian Lossless Packing, used in DVD-Audio, Dolby TrueHD, Blu-ray, and HD DVD
Difficulties in lossless compression of audio data

It is difficult to maintain all the data in an audio stream and achieve substantial compression. First, the vast majority of sound recordings are highly complex, recorded from the real world. As one of the key methods of compression is to find patterns and repetition, more chaotic data such as audio doesn't compress well. In a similar manner, photographs compress less efficiently with lossless methods than simpler computer-generated images do. But even computer-generated sounds can contain very complicated waveforms that present a challenge to many compression algorithms. This is due to the nature of audio waveforms, which are generally difficult to simplify without a (necessarily lossy) conversion to frequency information, as performed by the human ear.
The second reason is that the values of audio samples change very quickly, so generic data compression algorithms don't work well for audio: strings of consecutive bytes don't generally appear very often. However, convolution with the filter [-1 1] (that is, taking the first derivative) tends to slightly whiten (decorrelate, flatten) the spectrum, thereby allowing traditional lossless compression at the encoder to do its job; integration at the decoder restores the original signal. Codecs such as FLAC, Shorten, and TTA use linear prediction to estimate the spectrum of the signal. At the encoder, the estimator's inverse is used to whiten the signal by removing spectral peaks, while the estimator is used to reconstruct the original signal at the decoder.
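A minimal sketch of the first-difference/integration pair described above, on a synthetic tone (the 220 Hz sine is an arbitrary test signal): the residual is losslessly invertible yet has a far smaller range, so a generic entropy coder needs fewer bits.

```python
import numpy as np

t = np.arange(44_100) / 44_100.0
signal = np.round(3000 * np.sin(2 * np.pi * 220 * t)).astype(np.int64)

# Encoder: first difference = convolution with [-1, 1]; it flattens the spectrum.
residual = np.diff(signal, prepend=np.int64(0))

# Decoder: integration (running sum) inverts the difference exactly.
restored = np.cumsum(residual)
assert np.array_equal(restored, signal)

# The residual is much smaller, so fewer bits suffice to represent it.
print(int(np.abs(signal).max()), int(np.abs(residual).max()))   # 3000 vs ~94
```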
Evaluation criteria

Lossless audio codecs have no quality issues, so usability can be estimated by:
- Speed of compression and decompression
- Degree of compression
- Robustness and error correction
- Product support
Lossy audio compression

Lossy audio compression is used in a wide range of applications. In addition to the direct applications (MP3 players or computers), digitally compressed audio streams are used in most video DVDs; digital television; streaming media on the internet; satellite and cable radio; and, increasingly, in terrestrial radio broadcasts. Lossy compression typically achieves far greater compression than lossless compression (data of 5 percent to 20 percent of the original stream, rather than 50 percent to 60 percent), by discarding less-critical data.
The innovation of lossy audio compression was to use psychoacoustics to recognize that not all data in an audio stream can be perceived by the human auditory system. Most lossy compression reduces perceptual redundancy by first identifying sounds which are considered perceptually irrelevant, that is, sounds that are very hard to hear. Typical examples include high frequencies, or sounds that occur at the same time as louder sounds. Those sounds are coded with decreased accuracy or not coded at all.
Due to the nature of lossy algorithms, audio quality suffers when a file is decompressed and recompressed (digital generation loss). This makes lossy compression unsuitable for storing intermediate results in professional audio engineering applications, such as sound editing and multitrack recording. However, lossy formats are very popular with end users (particularly MP3), as a megabyte can store about a minute's worth of music at adequate quality.
Coding methods

In order to determine what information in an audio signal is perceptually irrelevant, most lossy compression algorithms use transforms such as the modified discrete cosine transform (MDCT) to convert time-domain sampled waveforms into a transform domain. Once transformed, typically into the frequency domain, component frequencies can be allocated bits according to how audible they are. Audibility of spectral components is determined by first calculating a masking threshold, below which it is estimated that sounds will be beyond the limits of human perception.
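A direct implementation of the MDCT definition, as a sketch (real codecs apply a window function and overlap consecutive frames by 50%, which this toy version omits): a pure tone concentrates its energy in a few coefficients, which is what lets the codec spend bits only where the ear needs them.

```python
import numpy as np

def mdct(frame: np.ndarray) -> np.ndarray:
    """MDCT of one window of 2N time-domain samples -> N frequency coefficients."""
    two_n = len(frame)
    n_bins = two_n // 2
    n = np.arange(two_n)
    k = np.arange(n_bins)
    # X[k] = sum_n x[n] * cos(pi/N * (n + 1/2 + N/2) * (k + 1/2))
    basis = np.cos(np.pi / n_bins * (n[None, :] + 0.5 + n_bins / 2) * (k[:, None] + 0.5))
    return basis @ frame

# A 1 kHz tone sampled at 44.1 kHz: energy lands in a handful of bins.
t = np.arange(1024) / 44_100.0
coeffs = mdct(np.sin(2 * np.pi * 1000.0 * t))
print(np.argsort(np.abs(coeffs))[-3:])   # indices of the dominant bins
```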
The masking threshold is calculated using the absolute threshold of hearing and the principles of simultaneous masking (the phenomenon wherein a signal is masked by another signal separated from it in frequency) and, in some cases, temporal masking (where a signal is masked by another signal separated from it in time). Equal-loudness contours may also be used to weight the perceptual importance of different components. Models of the human ear-brain combination incorporating such effects are often called psychoacoustic models.
Other types of lossy compressors, such as the linear predictive coding (LPC) used with speech, are source-based coders. These coders use a model of the sound's generator (such as the human vocal tract with LPC) to whiten the audio signal (i.e., flatten its spectrum) prior to quantization. LPC may also be thought of as a basic perceptual coding technique; reconstruction of an audio signal using a linear predictor shapes the coder's quantization noise into the spectrum of the target signal, partially masking it.
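A sketch of linear prediction as a whitening step. The least-squares fit below is an illustrative shortcut; production speech codecs typically use the Levinson-Durbin recursion on the autocorrelation sequence, and the order-8 predictor and test signal are arbitrary choices.

```python
import numpy as np

def lpc_residual(x: np.ndarray, order: int = 8):
    """Fit predictor coefficients by least squares and return the prediction
    error, i.e. the whitened signal that would actually be quantized."""
    # Row n of A holds the previous `order` samples used to predict x[n].
    A = np.column_stack([x[order - k : len(x) - k] for k in range(1, order + 1)])
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(A, target, rcond=None)
    residual = target - A @ coeffs
    return coeffs, residual

t = np.arange(2048) / 44_100.0
x = np.sin(2 * np.pi * 220 * t) + 0.3 * np.sin(2 * np.pi * 440 * t)
coeffs, residual = lpc_residual(x)
print(x.std(), residual.std())   # the residual's variance is far smaller
```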
Usability

Usability of lossy audio codecs is determined by:
- Perceived audio quality
- Compression factor
- Speed of compression and decompression
- Inherent latency of the algorithm (critical for real-time streaming applications; see below)
- Product support
Lossy formats are often used for the distribution of streaming audio or in interactive applications (such as the coding of speech for digital transmission in cell phone networks). In such applications, the data must be decompressed as it flows, rather than after the entire data stream has been transmitted. Not all audio codecs can be used for streaming applications, and for such applications a codec designed to stream data effectively will usually be chosen.
Latency results from the methods used to encode and decode the data. Some codecs analyze a longer segment of the data to optimize efficiency, and then code it in a manner that requires a larger segment of data at one time in order to decode. (Codecs often split the stream into segments called "frames" to create discrete data segments for encoding and decoding.) The inherent latency of the coding algorithm can be critical; for example, when there is two-way transmission of data, such as with a telephone conversation, significant delays may seriously degrade the perceived quality.
In contrast to the speed of compression, which is proportional to the number of operations required by the algorithm, latency here refers to the number of samples that must be analysed before a block of audio is processed. In the minimum case, latency is zero samples (e.g., if the coder/decoder simply reduces the number of bits used to quantize the signal). Time-domain algorithms such as LPC also often have low latencies, hence their popularity in speech coding for telephony. In algorithms such as MP3, however, a large number of samples have to be analyzed in order to implement a psychoacoustic model in the frequency domain, and latency is on the order of 23 ms (46 ms for two-way communication).
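The frame-buffering component of latency is simple arithmetic: samples that must arrive before coding can start, divided by the sample rate. The figures below are illustrative (MP3 frames hold 1152 samples; the exact latency of a real encoder also depends on lookahead and other implementation details, which is why the article's 23 ms figure differs from this naive estimate).

```python
SAMPLE_RATE = 44_100          # samples per second (CD-quality audio)

def frame_latency_ms(samples_per_frame: int) -> float:
    """Milliseconds of audio that must be buffered before coding can start."""
    return 1000.0 * samples_per_frame / SAMPLE_RATE

print(frame_latency_ms(1))      # ~0.02 ms: per-sample quantizer, near-zero latency
print(frame_latency_ms(1152))   # ~26 ms: one MP3-sized frame of 1152 samples
```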
Speech encoding is an important category of audio data compression. The perceptual models used to estimate what a human ear can hear are generally somewhat different from those used for music. The range of frequencies needed to convey the sounds of a human voice is normally far narrower than that needed for music, and the sound is normally less complex. As a result, speech can be encoded at high quality using relatively low bit rates.
This is accomplished, in general, by some combination of two approaches:
- Only encoding sounds that could be made by a single human voice.
- Discarding more of the data in the signal, keeping just enough to reconstruct an intelligible voice rather than the full frequency range of human hearing.
Perhaps the earliest algorithms used in speech encoding (and audio data compression in general) were the A-law algorithm and the µ-law algorithm.
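A sketch of µ-law companding (the continuous form of the standard, with µ = 255): samples are mapped onto a logarithmic scale before coarse quantization, so quiet sounds keep more resolution. The 8-bit quantization step below is a simplified illustration of how the curve is used.

```python
import numpy as np

MU = 255.0   # the value used by the North American / Japanese µ-law standard

def mu_law_compress(x: np.ndarray) -> np.ndarray:
    """Map linear samples in [-1, 1] onto a logarithmic scale."""
    return np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)

def mu_law_expand(y: np.ndarray) -> np.ndarray:
    """Inverse mapping, applied at the decoder."""
    return np.sign(y) * ((1.0 + MU) ** np.abs(y) - 1.0) / MU

x = np.array([-0.5, -0.01, 0.0, 0.01, 0.5])
y = np.round(mu_law_compress(x) * 127) / 127   # quantize to 8-bit levels
print(mu_law_expand(y))                        # quiet samples survive well
```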
The world's first commercial broadcast automation audio compression system was developed by Oscar Bonello, an engineering professor at the University of Buenos Aires. In 1983, using the psychoacoustic principle of the masking of critical bands first published in 1967, he started developing a practical application based on the recently developed IBM PC, and the broadcast automation system was launched in 1987 under the name Audicom. Twenty years later, almost all the radio stations in the world were using similar technology, manufactured by a number of companies.
Video

Video compression reduces redundancy in video data; most video compression algorithms and codecs combine spatial image compression and temporal motion compensation. Video compression is an example of the concept of source coding in information theory. Compressed video can effectively reduce the bandwidth required to transmit video via terrestrial broadcast, via cable TV, or via satellite TV services.
DVDs use a video coding standard called MPEG-2 that can compress video data by 15 to 30 times, while still producing a picture quality that is generally considered high for standard-definition video. Video compression is a tradeoff among disk space, video quality, and the cost of hardware required to decompress the video in a reasonable time. However, if the video is overcompressed in a lossy manner, visible (and sometimes distracting) artifacts can appear.
Video compression typically operates on square-shaped groups of neighboring pixels, often called macroblocks. These blocks of pixels are compared from one frame to the next, and the video compression codec (encode/decode scheme) sends only the differences within those blocks. This works extremely well if the video has no motion: a still frame of text, for example, can be repeated with very little transmitted data. In areas of video with more motion, more pixels change from one frame to the next, and the compression scheme must send more data to keep up. If the video content includes an explosion, flames, a flock of thousands of birds, or any other image with a great deal of high-frequency detail, the quality will decrease, or the variable bitrate must be increased to render this added information with the same level of detail.
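A toy sketch of this block-differencing idea (a simplified "conditional replenishment" scheme, not any particular codec): only macroblocks whose mean squared difference from the previous frame exceeds a threshold are transmitted; the decoder copies the rest. The 16-pixel block size and threshold are illustrative choices.

```python
import numpy as np

BLOCK = 16   # macroblock edge, in pixels

def changed_blocks(prev: np.ndarray, curr: np.ndarray, threshold: float = 4.0):
    """Yield (row, col, pixels) only for macroblocks that differ from the
    previous frame; the decoder copies every block that is not sent."""
    h, w = curr.shape
    for r in range(0, h - BLOCK + 1, BLOCK):
        for c in range(0, w - BLOCK + 1, BLOCK):
            a = prev[r:r + BLOCK, c:c + BLOCK].astype(np.int32)
            b = curr[r:r + BLOCK, c:c + BLOCK].astype(np.int32)
            if np.mean((a - b) ** 2) > threshold:      # mean squared difference
                yield r, c, curr[r:r + BLOCK, c:c + BLOCK]

# Two mostly identical 64x64 grayscale frames: only the moving region is sent.
rng = np.random.default_rng(0)
frame1 = rng.integers(0, 256, (64, 64), dtype=np.uint8)
frame2 = frame1.copy()
frame2[16:32, 16:32] = 255                             # one block changes
print([(r, c) for r, c, _ in changed_blocks(frame1, frame2)])   # [(16, 16)]
```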
The programming provider has control over the amount of video compression applied to their video programming before it is sent to their distribution system. DVDs, Blu-ray discs, and HD DVDs have video compression applied during their mastering process, though Blu-ray and HD DVD have enough disc capacity that most compression applied in these formats is light compared with, for example, most video streamed on the internet or taken on a cellphone. Software used for storing video on hard drives or various optical disc formats will often have a lower image quality. High-bitrate video codecs with little or no compression exist for video post-production work, but they create very large files and are thus almost never used for the distribution of finished videos. Once excessive lossy video compression compromises image quality, it is impossible to restore the image to its original quality.
Video is essentially a three-dimensional array of pixels. Two dimensions serve as spatial (horizontal and vertical) directions of the moving pictures, and one dimension represents the time domain. A data frame is the set of all pixels that correspond to a single moment in time. Basically, a frame is the same as a still picture.
Video data contains spatial and temporal redundancy. Similarities can thus be encoded by merely registering differences within a frame (spatial) and/or between frames (temporal). Spatial encoding takes advantage of the fact that the human eye is unable to distinguish small differences in color as easily as it can perceive changes in brightness, so very similar areas of color can be "averaged out" in a similar way to JPEG images. With temporal compression, only the changes from one frame to the next are encoded, as often a large number of the pixels will be the same on a series of frames.
Some forms of video compression are lossless. This means that when the data is decompressed, the result is a bit-for-bit perfect match with the original. While lossless compression of video is possible, it is rarely used, as lossy compression results in far higher compression ratios at an acceptable level of quality.
The most commonly used method works by comparing each frame in the video with the previous one. If the frame contains areas where nothing has moved, the system simply issues a short command that copies that part of the previous frame, bit-for-bit, into the next one. If sections of the frame move in a simple manner, the compressor emits a (slightly longer) command that tells the decompressor to shift, rotate, lighten, or darken the copy: a longer command, but still much shorter than intraframe compression. Interframe compression works well for programs that will simply be played back by the viewer, but can cause problems if the video sequence needs to be edited.
Since interframe compression copies data from one frame to another, if the original frame is simply cut out (or lost in transmission), the following frames cannot be reconstructed properly. Some video formats, such as DV, compress each frame independently using intraframe compression. Making 'cuts' in intraframe-compressed video is almost as easy as editing uncompressed video: one finds the beginning and end of each frame, copies bit-for-bit each frame to keep, and discards the frames one doesn't want. Another difference between intraframe and interframe compression is that, with intraframe systems, each frame uses a similar amount of data. In most interframe systems, certain frames (such as "I frames" in MPEG-2) aren't allowed to copy data from other frames, and so require much more data than other frames nearby.
It is possible to build a computer-based video editor that spots problems caused when I frames are edited out while other frames need them. This has allowed newer formats like HDV to be used for editing. However, this process demands a lot more computing power than editing intraframe-compressed video with the same picture quality.
Nearly all commonly used video compression methods (e.g., those in standards approved by the ITU-T or ISO) apply a discrete cosine transform (DCT) for spatial redundancy reduction. Other methods, such as fractal compression, matching pursuit, and the use of a discrete wavelet transform (DWT), have been the subject of some research, but are typically not used in practical products (except for the use of wavelet coding as still-image coders without motion compensation). Interest in fractal compression seems to be waning, due to recent theoretical analysis showing a comparative lack of effectiveness of such methods.
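A minimal sketch of DCT-based spatial compression on one 8x8 block (the block size used by JPEG and MPEG-2; the gradient test patch and the crude coefficient-dropping step stand in for a real quantization matrix):

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block: np.ndarray) -> np.ndarray:
    """2D DCT: apply the 1D transform along rows, then along columns."""
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(coeffs: np.ndarray) -> np.ndarray:
    return idct(idct(coeffs, axis=0, norm="ortho"), axis=1, norm="ortho")

# A smooth 8x8 gradient patch: most energy ends up in a few low frequencies.
x, y = np.meshgrid(np.arange(8), np.arange(8))
block = 8.0 * x + 4.0 * y

coeffs = dct2(block)
coeffs[4:, :] = 0.0        # crude "quantization": discard high frequencies
coeffs[:, 4:] = 0.0
approx = idct2(coeffs)
print(np.abs(approx - block).max())   # small error despite dropping 3/4 of coefficients
```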
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
and information theory
Information theory
Information theory is a branch of applied mathematics and electrical engineering involving the quantification of information. Information theory was developed by Claude E. Shannon to find fundamental limits on signal processing operations such as compressing data and on reliably storing and...
, data compression, source coding or bit-rate reduction is the process of encoding information
Information
Information in its most restricted technical sense is a message or collection of messages that consists of an ordered sequence of symbols, or it is the meaning that can be interpreted from such a message or collection of messages. Information can be recorded or transmitted. It can be recorded as...
using fewer bit
Bit
A bit is the basic unit of information in computing and telecommunications; it is the amount of information stored by a digital device or other physical system that exists in one of two possible distinct states...
s than the original representation would use.
Compression is useful because it helps reduce the consumption of expensive resources, such as hard disk
Hard disk
A hard disk drive is a non-volatile, random access digital magnetic data storage device. It features rotating rigid platters on a motor-driven spindle within a protective enclosure. Data is magnetically read from and written to the platter by read/write heads that float on a film of air above the...
space or transmission bandwidth
Bandwidth (computing)
In computer networking and computer science, bandwidth, network bandwidth, data bandwidth, or digital bandwidth is a measure of available or consumed data communication resources expressed in bits/second or multiples of it .Note that in textbooks on wireless communications, modem data transmission,...
. On the downside, compressed data must be decompressed to be used, and this extra processing may be detrimental to some applications. For instance, a compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed as it is being decompressed (the option of decompressing the video in full before watching it may be inconvenient, and requires storage space for the decompressed video). The design of data compression schemes therefore involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced (if using a lossy compression scheme
Lossy data compression
In information technology, "lossy" compression is a data encoding method that compresses data by discarding some of it. The procedure aims to minimize the amount of data that need to be held, handled, and/or transmitted by a computer...
), and the computational resources required to compress and uncompress the data. Compression was one of the main drivers for the growth of information during the past two decades.
Lossless versus lossy compression
LosslessLossless data compression
Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. The term lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be reconstructed, in exchange...
compression algorithms usually exploit statistical redundancy in such a way as to represent the sender's data more concisely without error. Lossless compression is possible because most real-world data has statistical redundancy. For example, in English text, the letter 'e' is much more common than the letter 'z', and the probability that the letter 'q' will be followed by the letter 'z' is very small.
Another kind of compression, called lossy data compression
Lossy data compression
In information technology, "lossy" compression is a data encoding method that compresses data by discarding some of it. The procedure aims to minimize the amount of data that need to be held, handled, and/or transmitted by a computer...
or perceptual coding, is possible if some loss of fidelity
Fidelity
"Fidelity" is the quality of being faithful or loyal. Its original meaning regarded duty to a lord or a king, in a broader sense than the related concept of fealty. Both derive from the Latin word fidēlis, meaning "faithful or loyal"....
is acceptable. Generally, a lossy data compression will be guided by research on how people perceive the data in question. For example, the human eye is more sensitive to subtle variations in luminance
Luminance
Luminance is a photometric measure of the luminous intensity per unit area of light travelling in a given direction. It describes the amount of light that passes through or is emitted from a particular area, and falls within a given solid angle. The SI unit for luminance is candela per square...
than it is to variations in color. JPEG
JPEG
In computing, JPEG . The degree of compression can be adjusted, allowing a selectable tradeoff between storage size and image quality. JPEG typically achieves 10:1 compression with little perceptible loss in image quality....
image compression works in part by "rounding off" some of this less-important information. Lossy data compression provides a way to obtain the best fidelity for a given amount of compression.
Lossy
Lossy image compressionImage compression
The objective of image compression is to reduce irrelevance and redundancy of the image data in order to be able to store or transmit data in an efficient form.- Lossy and lossless compression :...
is used in digital camera
Digital camera
A digital camera is a camera that takes video or still photographs, or both, digitally by recording images via an electronic image sensor. It is the main device used in the field of digital photography...
s, to increase storage capacities with minimal degradation of picture quality. Similarly, DVD
DVD
A DVD is an optical disc storage media format, invented and developed by Philips, Sony, Toshiba, and Panasonic in 1995. DVDs offer higher storage capacity than Compact Discs while having the same dimensions....
s use the lossy MPEG-2
MPEG-2
MPEG-2 is a standard for "the generic coding of moving pictures and associated audio information". It describes a combination of lossy video compression and lossy audio data compression methods which permit storage and transmission of movies using currently available storage media and transmission...
Video codec
Video codec
A video codec is a device or software that enables video compression and/or decompression for digital video. The compression usually employs lossy data compression. Historically, video was stored as an analog signal on magnetic tape...
for video compression.
In lossy audio compression, methods of psychoacoustics
Psychoacoustics
Psychoacoustics is the scientific study of sound perception. More specifically, it is the branch of science studying the psychological and physiological responses associated with sound...
are used to remove non-audible (or less audible) components of the signal
Audio signal processing
Audio signal processing, sometimes referred to as audio processing, is the intentional alteration of auditory signals, or sound. As audio signals may be electronically represented in either digital or analog format, signal processing may occur in either domain...
. Compression of human speech is often performed with even more specialized techniques, so that "speech compression
Speech encoding
Speech coding is the application of data compression of digital audio signals containing speech. Speech coding uses speech-specific parameter estimation using audio signal processing techniques to model the speech signal, combined with generic data compression algorithms to represent the resulting...
" or "voice coding" is sometimes distinguished as a separate discipline from "audio compression". Different audio and speech compression standards are listed under audio codec
Audio codec
All codecs are devices or computer programs capable of coding or decoding a digital data stream or signal.The term audio codec has two meanings depending on the context:...
s. Voice compression is used in Internet telephony for example, while audio compression is used for CD ripping and is decoded by audio players.
Lossless
The Lempel–Ziv (LZ) compression methods are among the most popular algorithms for lossless storage. DEFLATE is a variation on LZ which is optimized for decompression speed and compression ratio, but compression can be slow. DEFLATE is used in PKZIPPKZIP
PKZIP is an archiving tool originally written by Phil Katz and marketed by his company PKWARE, Inc. The common "PK" prefix used in both PKZIP and PKWARE stands for "Phil Katz".-History:...
, gzip
Gzip
Gzip is any of several software applications used for file compression and decompression. The term usually refers to the GNU Project's implementation, "gzip" standing for GNU zip. It is based on the DEFLATE algorithm, which is a combination of Lempel-Ziv and Huffman coding...
and PNG. LZW
LZW
Lempel–Ziv–Welch is a universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch. It was published by Welch in 1984 as an improved implementation of the LZ78 algorithm published by Lempel and Ziv in 1978...
(Lempel–Ziv–Welch) is used in GIF images. Also noteworthy are the LZR (LZ–Renau) methods, which serve as the basis of the Zip method. LZ methods utilize a table-based compression model where table entries are substituted for repeated strings of data. For most LZ methods, this table is generated dynamically from earlier data in the input. The table itself is often Huffman encoded
Huffman coding
In computer science and information theory, Huffman coding is an entropy encoding algorithm used for lossless data compression. The term refers to the use of a variable-length code table for encoding a source symbol where the variable-length code table has been derived in a particular way based on...
(e.g. SHRI, LZX).
A current LZ-based coding scheme that performs well is LZX
LZX (algorithm)
LZX is the name of an LZ77 family compression algorithm. It is also the name of a file archiver with the same name. Both were invented by Jonathan Forbes and Tomi Poutanen.-Amiga LZX:...
, used in Microsoft's CAB
Cabinet (file format)
In computing, CAB is the Microsoft Windows native compressed archive format. It supports compression and digital signing, and is used in a variety of Microsoft installation engines: Setup API, Device Installer, AdvPack and Windows Installer.Though Cabinet was originally called Diamond, its .CAB...
format.
The very best modern lossless compressors use probabilistic models, such as prediction by partial matching. The Burrows–Wheeler transform can also be viewed as an indirect form of statistical modelling.
In a further refinement of these techniques, statistical predictions can be coupled to an algorithm called arithmetic coding
Arithmetic coding
Arithmetic coding is a form of variable-length entropy encoding used in lossless data compression. Normally, a string of characters such as the words "hello there" is represented using a fixed number of bits per character, as in the ASCII code...
. Arithmetic coding, invented by Jorma Rissanen
Jorma Rissanen
Jorma J. Rissanen is an information theorist, known for inventing the arithmetic coding technique of lossless data compression, and the minimum description length principle....
, and turned into a practical method by Witten, Neal, and Cleary, achieves superior compression to the better-known Huffman algorithm, and lends itself especially well to adaptive data compression tasks where the predictions are strongly context-dependent. Arithmetic coding is used in the bilevel image-compression standard JBIG
JBIG
JBIG is a lossless image compression standard from the Joint Bi-level Image Experts Group, standardized as ISO/IEC standard 11544 and as ITU-T recommendation T.82. It is widely implemented in fax machines. Now that the newer bi-level image compression standard JBIG2 has been released, JBIG is also...
, and the document-compression standard DjVu
DjVu
DjVu is a computer file format designed primarily to store scanned documents, especially those containing a combination of text, line drawings, and photographs. It uses technologies such as image layer separation of text and background/images, progressive loading, arithmetic coding, and lossy...
. The text entry system, Dasher
Dasher
Dasher is a computer accessibility tool which enables users to write without using a keyboard, by entering text on a screen using a pointing device such as a mouse, a touchpad, a touch screen, a roller ball, a joystick, a Push-button, a Wii Remote, or even mice operated by the foot or head...
, is an inverse-arithmetic-coder.
Theory
The theoretical background of compression is provided by information theoryInformation theory
Information theory is a branch of applied mathematics and electrical engineering involving the quantification of information. Information theory was developed by Claude E. Shannon to find fundamental limits on signal processing operations such as compressing data and on reliably storing and...
(which is closely related to algorithmic information theory
Algorithmic information theory
Algorithmic information theory is a subfield of information theory and computer science that concerns itself with the relationship between computation and information...
) for lossless compression, and by rate–distortion theory for lossy compression. These fields of study were essentially created by Claude Shannon, who published fundamental papers on the topic in the late 1940s and early 1950s. Coding theory
Coding theory
Coding theory is the study of the properties of codes and their fitness for a specific application. Codes are used for data compression, cryptography, error-correction and more recently also for network coding...
is also related. The idea of data compression is deeply connected with statistical inference
Statistical inference
In statistics, statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation...
.
Machine learning
There is a close connection between machine learningMachine learning
Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases...
and compression: a system that predicts the posterior probabilities of a sequence given its entire history can be used for optimal data compression (by using arithmetic coding on the output distribution), while an optimal compressor can be used for prediction (by finding the symbol that compresses best, given the previous history). This equivalence has been used as justification for data compression as a benchmark for "general intelligence".
Data differencing
Data compression can be viewed as a special case of data differencingData differencing
In computer science and information theory, data differencing or differential compression is producing a technical description of the difference between two sets of data – a source and a target...
: data differencing consists of producing a difference given a source and a target, with patching producing a target given a source and a difference, while data compression consists of producing a compressed file given a target, and decompression consists of producing a target given only a compressed file. Thus, one can consider data compression as data differencing with empty source data, the compressed file corresponding to a "difference from nothing". This is the same as considering absolute entropy (corresponding to data compression) as a special case of relative entropy (corresponding to data differencing) with no initial data.
When one wishes to emphasize the connection, one may use the term differential compression to refer to data differencing.
Outlook and currently unused potential
It is estimated that the total amount of the information that is stored on the world's storage devices could be furthermore compressed by a remaining average factor of 4.5 : 1 with existing compression algorithms, which means that thanks to compression, the world could store 4.5 times more information on its existing storage devices than it currently does (it is estimated that the combined technological capacity of the world to store information provides 1,300 exabytes of hardware digits in 2007, but when the corresponding content is optimally compressed, this only represents 295 exabytes of Shannon information).Audio
Audio compression is designed to reduce the transmission bandwidth requirement of digital audio streams and the storage size of audio files. Audio compression algorithmAlgorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...
s are implemented in computer software
Computer software
Computer software, or just software, is a collection of computer programs and related data that provide the instructions for telling a computer what to do and how to do it....
as audio codec
Audio codec
All codecs are devices or computer programs capable of coding or decoding a digital data stream or signal.The term audio codec has two meanings depending on the context:...
s. Generic data compression algorithms perform poorly with audio data, seldom reducing data size much below 87% from the original, and are not designed for use in real time applications. Consequently, specifically optimized audio lossless
Lossless data compression
Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. The term lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be reconstructed, in exchange...
and lossy
Lossy data compression
In information technology, "lossy" compression is a data encoding method that compresses data by discarding some of it. The procedure aims to minimize the amount of data that need to be held, handled, and/or transmitted by a computer...
algorithms have been created. Lossy algorithms provide greater compression rates and are used in mainstream consumer audio devices.
In both lossy and lossless compression, information redundancy
Redundancy (information theory)
Redundancy in information theory is the number of bits used to transmit a message minus the number of bits of actual information in the message. Informally, it is the amount of wasted "space" used to transmit certain data...
is reduced, using methods such as coding
Coding theory
Coding theory is the study of the properties of codes and their fitness for a specific application. Codes are used for data compression, cryptography, error-correction and more recently also for network coding...
, pattern recognition
Pattern recognition
In machine learning, pattern recognition is the assignment of some sort of output value to a given input value , according to some specific algorithm. An example of pattern recognition is classification, which attempts to assign each input value to one of a given set of classes...
and linear prediction
Linear prediction
Linear prediction is a mathematical operation where future values of a discrete-time signal are estimated as a linear function of previous samples....
to reduce the amount of information used to represent the uncompressed data.
The trade-off between slightly reduced audio quality and transmission or storage size is outweighed by the latter for most practical audio applications in which users may not perceive the loss in playback rendition quality. For example, one Compact Disc
Compact Disc
The Compact Disc is an optical disc used to store digital data. It was originally developed to store and playback sound recordings exclusively, but later expanded to encompass data storage , write-once audio and data storage , rewritable media , Video Compact Discs , Super Video Compact Discs ,...
holds approximately one hour of uncompressed high fidelity music, less than 2 hours of music compressed losslessly, or 7 hours of music compressed in the MP3
MP3
MPEG-1 or MPEG-2 Audio Layer III, more commonly referred to as MP3, is a patented digital audio encoding format using a form of lossy data compression...
format at medium bit rate
Bit rate
In telecommunications and computing, bit rate is the number of bits that are conveyed or processed per unit of time....
s.
Lossless audio compression
Lossless audio compression produces a representation of digital data that can be expanded to an exact digital duplicate of the original audio stream. This is in contrast to the irreversible changes upon playback from lossy compression techniques such as VorbisVorbis
Vorbis is a free software / open source project headed by the Xiph.Org Foundation . The project produces an audio format specification and software implementation for lossy audio compression...
and MP3
MP3
MPEG-1 or MPEG-2 Audio Layer III, more commonly referred to as MP3, is a patented digital audio encoding format using a form of lossy data compression...
. Compression ratios are similar to those for generic lossless data compression (around 50–60% of original size ), and substantially less than for lossy compression, which typically yield 5–20% of original size.
The primary application areas of lossless encoding are:
Archive
Archive
An archive is a collection of historical records, or the physical place they are located. Archives contain primary source documents that have accumulated over the course of an individual or organization's lifetime, and are kept to show the function of an organization...
s: For archival purposes it is generally desired to preserve the source material exactly (i.e., at 'best possible quality').
Editing: Audio engineers use lossless compression for audio editing to avoid digital generation loss.
High fidelity playback: Audiophile
Audiophile
An audiophile is a person who enjoys listening to recorded music, usually in a home. Some audiophiles are more interested in collecting and listening to music, while others are more interested in collecting and listening to audio components, whose "sound quality" they consider as important as the...
s prefer lossless compression formats to avoid compression artifact
Compression artifact
A compression artifact is a noticeable distortion of media caused by the application of lossy data compression....
s.
Creating master copies for mass-produced audio: High quality losslessly compressed master copies of recordings are used to produce lossily compressed versions for digital audio players. As formats and encoders improve, updated lossily compressed files may be generated from the lossless master.
- As file storage and communications bandwidth have become less expensive and more available, lossless audio compression has become more popular.
Formats
Shorten
Shorten
Shorten is a file format used for compressing audio data. It is a form of data compression of files and is used to losslessly compress CD-quality audio files . Shorten is no longer developed and more recent lossless audio codecs such as FLAC, Monkey's Audio , TTA, and WavPack have become more...
was an early lossless format; newer ones include Free Lossless Audio Codec (FLAC), Apple's Apple Lossless
Apple Lossless
Apple Lossless Apple Lossless Apple Lossless (also known as ALAC (Apple Lossless Audio Codec), or ALE (Apple Lossless Encoder) is an audio codec developed by Apple Inc. for lossless data compression of digital music. After initially being proprietary for many years, in late 2011 Apple open sourced...
, MPEG-4 ALS, Windows Media Audio 9 Lossless
Windows Media Audio 9 Lossless
Windows Media Audio 9 Lossless is a lossless audio codec by Microsoft, released in early 2003.It compresses an audio CD to a range of 206 to 411MB, at bit rates of 470 to 940 kbit/s. The result is a bit-for-bit duplicate of the original audio file; in other words, the audio quality on the CD will...
(WMA Lossless), Monkey's Audio
Monkey's Audio
Monkey's Audio is a file format for audio data compression. Being a lossless format, Monkey's Audio does not discard data during the process of encoding, unlike lossy compression methods such as AAC, MP3, Vorbis and Musepack....
, and TTA
TTA (codec)
True Audio is a free software, real-time lossless audio codec, based on adaptive prognostic filters.Also, .tta is the generic extension to filenames of audio files created by True Audio codec.- Codec overview :...
. See list of lossless codecs for a complete list.
Some audio formats feature a combination of a lossy format and a lossless correction; this allows stripping the correction to easily obtain a lossy file. Such formats include MPEG-4 SLS
MPEG-4 SLS
MPEG-4 SLS, or MPEG-4 Scalable to Lossless as per ISO/IEC 14496-3:2005/Amd 3:2006 , is an extension to the MPEG-4 Part 3 standard to allow lossless audio compression scalable to lossy MPEG-4 General Audio coding methods...
(Scalable to Lossless), WavPack
WavPack
WavPack is a free, open source lossless audio compression format developed by David Bryant.-Features:WavPack compression can compress 8-, 16-, 24-, and 32-bit fixed-point, and 32-bit floating point audio files in the .WAV file format. It also supports surround sound streams and high frequency...
, and OptimFROG DualStream.
Some formats are associated with a technology, such as:
- Direct Stream Transfer, used in Super Audio CDSuper Audio CDSuper Audio CD is a high-resolution, read-only optical disc for audio storage. Sony and Philips Electronics jointly developed the technology, and publicized it in 1999. It is designated as the Scarlet Book standard. Sony and Philips previously collaborated to define the Compact Disc standard...
- Meridian Lossless PackingMeridian Lossless PackingMeridian Lossless Packing, also known as Packed PCM , is a proprietary lossless compression technique for compressing PCM audio data developed by Meridian Audio, Ltd. MLP is the standard lossless compression method for DVD-Audio content and typically provides about 1.5:1 compression on most music...
, used in DVD-AudioDVD-AudioDVD-Audio is a digital format for delivering high-fidelity audio content on a DVD. DVD-Audio is not intended to be a video delivery format and is not the same as video DVDs containing concert films or music videos....
, Dolby TrueHDDolby TrueHDDolby TrueHD is an advanced lossless multi-channel audio codec developed by Dolby Laboratories which is intended primarily for high-definition home-entertainment equipment such as Blu-ray Disc and HD DVD. It is the successor to the AC-3 Dolby Digital surround sound codec which was used as the...
, Blu-ray and HD DVDHD DVDHD DVD is a discontinued high-density optical disc format for storing data and high-definition video.Supported principally by Toshiba, HD DVD was envisioned to be the successor to the standard DVD format...
Difficulties in lossless compression of audio data
It is difficult to maintain all the data in an audio stream and achieve substantial compression. First, the vast majority of sound recordings are highly complex, recorded from the real world. As one of the key methods of compression is to find patterns and repetition, more chaotic data such as audio doesn't compress well. In a similar manner, photograph
Photograph
A photograph is an image created by light falling on a light-sensitive surface, usually photographic film or an electronic imager such as a CCD or a CMOS chip. Most photographs are created using a camera, which uses a lens to focus the scene's visible wavelengths of light into a reproduction of...
s compress less efficiently with lossless methods than simpler computer-generated images do. But interestingly, even computer generated sounds can contain very complicated waveform
Waveform
Waveform means the shape and form of a signal such as a wave moving in a physical medium or an abstract representation.In many cases the medium in which the wave is being propagated does not permit a direct visual image of the form. In these cases, the term 'waveform' refers to the shape of a graph...
s that present a challenge to many compression algorithms. This is due to the nature of audio waveforms, which are generally difficult to simplify without a (necessarily lossy) conversion to frequency information, as performed by the human ear.
The second reason is that values of audio samples change very quickly, so generic data compression algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...
s don't work well for audio, and strings of consecutive bytes don't generally appear very often. However, convolution
Convolution
In mathematics and, in particular, functional analysis, convolution is a mathematical operation on two functions f and g, producing a third function that is typically viewed as a modified version of one of the original functions. Convolution is similar to cross-correlation...
with the filter [-1 1] (that is, taking the first derivative) tends to slightly whiten
White noise
White noise is a random signal with a flat power spectral density. In other words, the signal contains equal power within a fixed bandwidth at any center frequency...
(decorrelate
Decorrelation
Decorrelation is a general term for any process that is used to reduce autocorrelation within a signal, or cross-correlation within a set of signals, while preserving other aspects of the signal. A frequently used method of decorrelation is the use of a matched linear filter to reduce the...
, make flat) the spectrum, thereby allowing traditional lossless compression at the encoder to do its job; integration at the decoder restores the original signal. Codecs such as FLAC
FLAC
FLAC is a codec which allows digital audio to be losslessly compressed such that file size is reduced without any information being lost...
, Shorten
Shorten
Shorten is a file format used for compressing audio data. It is a form of data compression of files and is used to losslessly compress CD-quality audio files . Shorten is no longer developed and more recent lossless audio codecs such as FLAC, Monkey's Audio , TTA, and WavPack have become more...
and TTA
TTA (codec)
True Audio is a free software, real-time lossless audio codec, based on adaptive prognostic filters.Also, .tta is the generic extension to filenames of audio files created by True Audio codec.- Codec overview :...
use linear prediction
Linear prediction
Linear prediction is a mathematical operation where future values of a discrete-time signal are estimated as a linear function of previous samples....
to estimate the spectrum of the signal. At the encoder, the estimator's inverse is used to whiten the signal by removing spectral peaks while the estimator is used to reconstruct the original signal at the decoder.
Evaluation criteria
Lossless audio codecs have no quality issues, so the usability can be estimated by
- Speed of compression and decompression
- Degree of compression
- Robustness and error correction
- Product support
Lossy audio compression
Lossy audio compression is used in a wide range of applications. In addition to the direct applications (mp3 players or computers), digitally compressed audio streams are used in most video DVDs; digital television; streaming media on the internetInternet
The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite to serve billions of users worldwide...
; satellite and cable radio; and increasingly in terrestrial radio broadcasts. Lossy compression typically achieves far greater compression than lossless compression (data of 5 percent to 20 percent of the original stream, rather than 50 percent to 60 percent), by discarding less-critical data.
The innovation of lossy audio compression was to use psychoacoustics
Psychoacoustics
Psychoacoustics is the scientific study of sound perception. More specifically, it is the branch of science studying the psychological and physiological responses associated with sound...
to recognize that not all data in an audio stream can be perceived by the human auditory system. Most lossy compression reduces perceptual redundancy by first identifying sounds which are considered perceptually irrelevant, that is, sounds that are very hard to hear. Typical examples include high frequencies, or sounds that occur at the same time as louder sounds. Those sounds are coded with decreased accuracy or not coded at all.
Due to the nature of lossy algorithms, audio quality suffers when a file is decompressed and recompressed (digital generation loss). This makes lossy compression unsuitable for storing the intermediate results in professional audio engineering applications, such as sound editing and multitrack recording. However, they are very popular with end users (particularly MP3
MP3
MPEG-1 or MPEG-2 Audio Layer III, more commonly referred to as MP3, is a patented digital audio encoding format using a form of lossy data compression...
), as a megabyte can store about a minute's worth of music at adequate quality.
Coding methods
In order to determine what information in an audio signal is perceptually irrelevant, most lossy compression algorithms use transforms such as the modified discrete cosine transform (MDCT) to convert time-domain sampled waveforms into a transform domain. Once transformed, typically into the frequency domain, component frequencies can be allocated bits according to how audible they are. Audibility of spectral components is determined by first calculating a masking threshold, below which it is estimated that sounds will be beyond the limits of human perception.
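To make the transform step concrete, here is a minimal numpy sketch of an MDCT over a single frame. The direct O(N²) evaluation, the sine window, and the 1024-sample frame length are illustrative choices for the sketch, not any particular codec's implementation.

```python
import numpy as np

def mdct(frame):
    """MDCT of one frame of 2N samples, yielding N coefficients:
    X[k] = sum_n x[n] * cos(pi/N * (n + 0.5 + N/2) * (k + 0.5))."""
    two_n = len(frame)
    N = two_n // 2
    n = np.arange(two_n)
    k = np.arange(N)[:, None]
    basis = np.cos(np.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
    return basis @ frame

# A 1 kHz tone at 48 kHz, shaped by a sine window (chosen so that
# 50%-overlapped frames would reconstruct perfectly in a full codec).
fs, frame_len = 48000, 1024
t = np.arange(frame_len) / fs
window = np.sin(np.pi * (np.arange(frame_len) + 0.5) / frame_len)
coeffs = mdct(np.sin(2 * np.pi * 1000 * t) * window)
# Expect the energy near bin 1000 / (24000 / 512) ≈ 21.
print("strongest coefficient at bin", int(np.argmax(np.abs(coeffs))))
```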
The masking threshold is calculated using the absolute threshold of hearing and the principles of simultaneous masking (the phenomenon wherein a signal is masked by another signal separated from it in frequency) and, in some cases, temporal masking (where a signal is masked by another signal separated from it in time). Equal-loudness contours may also be used to weight the perceptual importance of different components. Models of the human ear-brain combination incorporating such effects are often called psychoacoustic models.
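To illustrate allocating bits by audibility, the following toy greedy allocator spends a bit budget on the bands whose energy most exceeds the masking threshold (the signal-to-mask ratio). The function name, the 6 dB-per-bit rule of thumb, and the example numbers are assumptions for the sketch and are not taken from any standard.

```python
import numpy as np

def allocate_bits(band_energy_db, mask_db, total_bits, max_bits=16):
    """Toy bit allocation: repeatedly give one bit to the band with the
    largest remaining signal-to-mask ratio; fully masked bands get none."""
    smr = np.maximum(band_energy_db - mask_db, 0.0)  # dB above the mask
    bits = np.zeros_like(smr, dtype=int)
    for _ in range(total_bits):
        need = smr - 6.0 * bits   # each bit buys roughly 6 dB of SNR
        b = int(np.argmax(need))
        if need[b] <= 0 or bits[b] >= max_bits:
            break                 # everything audible is already covered
        bits[b] += 1
    return bits

energy = np.array([60., 40., 55., 20.])  # per-band energy, dB
mask = np.array([30., 45., 35., 25.])    # masking threshold, dB
print(allocate_bits(energy, mask, total_bits=12))  # masked bands -> 0 bits
```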
Other types of lossy compressors, such as the linear predictive coding (LPC) used with speech, are source-based coders. These coders use a model of the sound's generator (such as the human vocal tract, in the case of LPC) to whiten the audio signal (i.e., flatten its spectrum) prior to quantization. LPC may also be thought of as a basic perceptual coding technique; reconstruction of an audio signal using a linear predictor shapes the coder's quantization noise into the spectrum of the target signal, partially masking it.
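A rough sketch of the whitening idea, assuming nothing beyond numpy: estimate predictor coefficients from the signal's autocorrelation, then keep only the (much smaller) prediction residual. Production speech coders instead run the Levinson-Durbin recursion on short analysis frames; the helper names here are illustrative.

```python
import numpy as np

def lpc_coefficients(signal, order):
    """Estimate linear-prediction coefficients by solving the
    Yule-Walker equations built from the signal's autocorrelation."""
    ac = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    R = np.array([[ac[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, ac[1:order + 1])

def whiten(signal, a):
    """Prediction residual e[n] = x[n] - sum_k a[k] x[n-k]: the flattened
    signal a source-based coder quantizes instead of the raw waveform."""
    predicted = np.zeros_like(signal)
    for k, ak in enumerate(a, start=1):
        predicted[k:] += ak * signal[:-k]
    return signal - predicted

rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 0.05 * np.arange(400)) + 0.01 * rng.standard_normal(400)
residual = whiten(x, lpc_coefficients(x, order=8))
# The predictor captures the tone, so the residual carries little energy.
print("residual/signal energy:", np.sum(residual**2) / np.sum(x**2))
```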
Usability
Usability of lossy audio codecs is determined by:
- Perceived audio quality
- Compression factor
- Speed of compression and decompression
- Inherent latency of algorithm (critical for real-time streaming applications; see below)
- Product support
Lossy formats are often used for the distribution of streaming audio, or interactive applications (such as the coding of speech for digital transmission in cell phone networks). In such applications, the data must be decompressed as the data flows, rather than after the entire data stream has been transmitted. Not all audio codecs can be used for streaming applications, and for such applications a codec designed to stream data effectively will usually be chosen.
Latency results from the methods used to encode and decode the data. Some codecs analyze a longer segment of the data to optimize efficiency, and then code it in a manner that requires a larger segment of data at one time in order to decode. (Codecs often split the stream into segments called "frames" to create discrete units for encoding and decoding.) The inherent latency of the coding algorithm can be critical; for example, when there is two-way transmission of data, such as in a telephone conversation, significant delays may seriously degrade the perceived quality.
In contrast to the speed of compression, which is proportional to the number of operations required by the algorithm, latency here refers to the number of samples that must be analyzed before a block of audio is processed. In the minimum case, latency is zero samples (e.g., if the coder/decoder simply reduces the number of bits used to quantize the signal). Time-domain algorithms such as LPC also often have low latencies, hence their popularity in speech coding for telephony. In algorithms such as MP3, however, a large number of samples have to be analyzed in order to implement a psychoacoustic model in the frequency domain, and latency is on the order of 23 ms (46 ms for two-way communication).
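The frame-size arithmetic behind such figures is straightforward, as this small sketch shows. The 1152-sample figure is MP3's standard frame length; exact latencies vary with sample rate, and real encoders add look-ahead on top of this buffering delay.

```python
# A codec cannot emit output until a whole frame is buffered, so the
# minimum one-way delay contributed by framing is frame_len / sample_rate.
def frame_latency_ms(frame_len, sample_rate):
    return 1000.0 * frame_len / sample_rate

print(frame_latency_ms(1152, 48000))      # 24.0 ms one-way for an MP3 frame
print(2 * frame_latency_ms(1152, 48000))  # 48.0 ms round trip in two-way speech
```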
Speech encoding
Speech encoding is an important category of audio data compression. The perceptual models used to estimate what a human ear can hear are generally somewhat different from those used for music. The range of frequencies needed to convey the sounds of a human voice is normally far narrower than that needed for music, and the sound is normally less complex. As a result, speech can be encoded at high quality using relatively low bit rates.
This is accomplished, in general, by some combination of two approaches:
- Only encoding sounds that could be made by a single human voice.
- Throwing away more of the data in the signal, keeping just enough to reconstruct an "intelligible" voice rather than the full frequency range of human hearing.
Perhaps the earliest algorithms used in speech encoding (and in audio data compression in general) were the A-law algorithm and the µ-law algorithm, two closely related standard companding schemes that compress the dynamic range of a signal before digitization.
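For instance, µ-law companding applies a logarithmic curve so that quiet samples get relatively finer resolution than loud ones. Below is a minimal numpy sketch of the continuous companding law with µ = 255, as in the North American standard; the 8-bit quantization step that makes the scheme lossy is omitted for brevity.

```python
import numpy as np

def mu_law_encode(x, mu=255):
    """Compand a signal in [-1, 1]: F(x) = sgn(x) * ln(1 + mu|x|) / ln(1 + mu)."""
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_decode(y, mu=255):
    """Invert the companding curve to recover x from y."""
    return np.sign(y) * ((1 + mu) ** np.abs(y) - 1) / mu

x = np.linspace(-1, 1, 9)
y = mu_law_encode(x)
print(np.round(y, 3))
print(np.allclose(mu_law_decode(y), x))  # True: the curve itself is invertible;
                                         # loss comes from quantizing y coarsely
```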
History
A literature compendium covering a large variety of audio coding systems was published in the IEEE Journal on Selected Areas in Communications (JSAC), February 1988. While there were some papers from before that time, this collection documented an entire variety of finished, working audio coders, nearly all of them using perceptual (i.e., masking) techniques and some kind of frequency analysis and back-end noiseless coding. Several of these papers remarked on the difficulty of obtaining good, clean digital audio for research purposes. Most, if not all, of the authors in the JSAC edition were also active in the MPEG-1 Audio committee.
The world's first commercial broadcast automation audio compression system was developed by Oscar Bonello, an engineering professor at the University of Buenos Aires. In 1983, using the psychoacoustic principle of the masking of critical bands first published in 1967, he started developing a practical application based on the recently developed IBM PC, and the broadcast automation system was launched in 1987 under the name Audicom. Twenty years later, almost all the radio stations in the world were using similar technology, manufactured by a number of companies.
Video
Video compression is a combination of spatial image compression and temporal motion compensation. It is an example of the concept of source coding in information theory. Compressed video can effectively reduce the bandwidth required to transmit video via terrestrial broadcast, cable TV, or satellite TV services.
Video quality
Most video compression is lossy: it operates on the premise that much of the data present before compression is not necessary for achieving good perceptual quality. For example, DVDs use a video coding standard called MPEG-2 that can compress video data by 15 to 30 times while still producing a picture quality generally considered high for standard-definition video. Video compression is a trade-off between disk space, video quality, and the cost of hardware required to decompress the video in a reasonable time. However, if the video is overcompressed in a lossy manner, visible (and sometimes distracting) compression artifacts can appear.
Video compression typically operates on square-shaped groups of neighboring pixels, often called macroblocks. These blocks of pixels are compared from one frame to the next, and the video codec (encode/decode scheme) sends only the differences within those blocks. This works extremely well if the video has no motion: a still frame of text, for example, can be repeated with very little transmitted data. In areas of video with more motion, more pixels change from one frame to the next, so the compression scheme must send more data to keep up with the larger number of changing pixels. If the video content includes an explosion, flames, a flock of thousands of birds, or any other image with a great deal of high-frequency detail, the quality will decrease, or the variable bitrate must be increased to render this added information with the same level of detail.
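A toy sketch of this block-differencing idea, assuming grayscale frames stored as numpy arrays; real codecs add motion search, transforms, and entropy coding on top, and the block size, threshold, and function name here are illustrative.

```python
import numpy as np

def encode_interframe(reference, current, block=16, threshold=2.0):
    """Toy interframe step: for each block, send nothing if it closely
    matches the reference; otherwise send the residual (current - reference)."""
    h, w = current.shape
    messages = []
    for y in range(0, h, block):
        for x in range(0, w, block):
            residual = current[y:y+block, x:x+block].astype(np.int16) \
                     - reference[y:y+block, x:x+block].astype(np.int16)
            if np.abs(residual).mean() > threshold:  # block changed: transmit
                messages.append((y, x, residual))
    return messages  # unchanged blocks cost nothing beyond a "copy" flag

ref = np.zeros((64, 64), dtype=np.uint8)
cur = ref.copy()
cur[16:32, 16:32] = 200                   # only one region changes
print(len(encode_interframe(ref, cur)))  # 1: one block sent, not the frame
```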
The programming provider has control over the amount of video compression applied to their video programming before it is sent to their distribution system. DVDs, Blu-ray discs, and HD DVDs have video compression applied during their mastering process, though Blu-ray and HD DVD have enough disc capacity that most compression applied in these formats is light compared with, for example, most video streamed on the Internet or captured on a cellphone. Software used for storing video on hard drives or various optical disc formats will often have lower image quality. High-bitrate video codecs with little or no compression exist for video post-production work, but they create very large files and are thus almost never used for the distribution of finished videos. Once excessive lossy video compression compromises image quality, it is impossible to restore the image to its original quality.
Theory
Video is basically a three-dimensional array of color pixels. Two dimensions serve as the spatial (horizontal and vertical) directions of the moving pictures, and one dimension represents the time domain. A frame is the set of all pixels that correspond to a single moment in time; basically, a frame is the same as a still picture.
Video data contains spatial and temporal redundancy. Similarities can thus be encoded by merely registering differences within a frame (spatial) and/or between frames (temporal). Spatial encoding takes advantage of the fact that the human eye is unable to distinguish small differences in color as easily as it can perceive changes in brightness, so very similar areas of color can be "averaged out" in a similar way to JPEG images. With temporal compression, only the changes from one frame to the next are encoded, as a large number of the pixels are often the same across a series of frames.
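Chroma subsampling is the most common form of this spatial "averaging out". Here is a minimal sketch of 4:2:0-style downsampling of a chroma plane; real encoders use proper filtering rather than a plain 2x2 mean, and the function name is illustrative.

```python
import numpy as np

def subsample_chroma_420(chroma_plane):
    """4:2:0-style chroma subsampling: average each 2x2 block into one
    sample, exploiting the eye's lower acuity for color than brightness."""
    h, w = chroma_plane.shape
    return chroma_plane.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

chroma = np.random.randint(0, 256, (480, 640)).astype(np.float32)
small = subsample_chroma_420(chroma)
print(chroma.shape, "->", small.shape)  # (480, 640) -> (240, 320): 4x fewer samples
```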
Lossless compression
Some forms of data compression are losslessLossless data compression
Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. The term lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be reconstructed, in exchange...
. This means that when the data is decompressed, the result is a bit-for-bit perfect match with the original. While lossless compression of video is possible, it is rarely used, as lossy compression results in far higher compression ratios at an acceptable level of quality.
Intraframe versus interframe compression
One of the most powerful techniques for compressing video is interframe compression. Interframe compression uses one or more earlier or later frames in a sequence to compress the current frame, while intraframe compression uses only the current frame, making it effectively image compression.
The most commonly used method works by comparing each frame in the video with the previous one. If the frame contains areas where nothing has moved, the system simply issues a short command that copies that part of the previous frame, bit-for-bit, into the next one. If sections of the frame move in a simple manner, the compressor emits a (slightly longer) command that tells the decompressor to shift, rotate, lighten, or darken the copy: a longer command, but still much shorter than intraframe compression. Interframe compression works well for programs that will simply be played back by the viewer, but can cause problems if the video sequence needs to be edited.
Since interframe compression copies data from one frame to another, if the original frame is simply cut out (or lost in transmission), the following frames cannot be reconstructed properly. Some video formats, such as DV, compress each frame independently using intraframe compression. Making 'cuts' in intraframe-compressed video is almost as easy as editing uncompressed video: one finds the beginning and end of each frame, copies bit-for-bit the frames to keep, and discards the rest. Another difference between intraframe and interframe compression is that with intraframe systems, each frame uses a similar amount of data. In most interframe systems, certain frames (such as the "I frames" in MPEG-2) aren't allowed to copy data from other frames, and so require much more data than the other frames nearby.
It is possible to build a computer-based video editor that spots the problems caused when I frames are edited out while other frames still need them. This has allowed newer formats like HDV to be used for editing. However, this process demands a lot more computing power than editing intraframe-compressed video with the same picture quality.
Current forms
Today, nearly all commonly used video compression methods (e.g., those in standards approved by the ITU-T or ISO) apply a discrete cosine transform (DCT) for spatial redundancy reduction. Other methods, such as fractal compression, matching pursuit, and the use of a discrete wavelet transform (DWT), have been the subject of some research, but are typically not used in practical products (except for the use of wavelet coding as still-image coders without motion compensation). Interest in fractal compression seems to be waning, due to recent theoretical analysis showing a comparative lack of effectiveness of such methods.
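A brief sketch of why the DCT helps, using scipy's dctn: smooth image blocks concentrate their energy in a few low-frequency coefficients, so most coefficients can be quantized to zero with little reconstruction error. The 8x8 block size matches JPEG/MPEG practice, but the gradient test pattern and the crude keep/drop rule are illustrative assumptions, not a real quantizer.

```python
import numpy as np
from scipy.fft import dctn, idctn

# An 8x8 block with a smooth horizontal gradient, like much natural imagery.
block = np.tile(np.linspace(0, 255, 8), (8, 1))

coeffs = dctn(block, norm="ortho")   # 2-D type-II DCT, as used in JPEG/MPEG
kept = np.abs(coeffs) > 1.0          # crude "quantization": drop tiny terms
approx = idctn(np.where(kept, coeffs, 0), norm="ortho")

print("coefficients kept:", int(kept.sum()), "of 64")       # only a handful
print("max reconstruction error:", np.abs(approx - block).max())  # small
```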
Timeline
The following table is a partial history of international video compression standards.

Year | Standard | Publisher | Popular Implementations
---|---|---|---
1984 | H.120 | ITU-T |
1990 | H.261 | ITU-T | Videoconferencing, videotelephony
1993 | MPEG-1 Part 2 | ISO, IEC | Video CD
1995 | H.262/MPEG-2 Part 2 | ISO, IEC, ITU-T | DVD Video, Blu-ray, Digital Video Broadcasting, SVCD
1996 | H.263 | ITU-T | Videoconferencing, videotelephony, video on mobile phones (3GP)
1999 | MPEG-4 Part 2 | ISO, IEC | Video on the Internet (DivX, Xvid, ...)
2003 | H.264/MPEG-4 AVC | ISO, IEC, ITU-T | Blu-ray, Digital Video Broadcasting, iPod video, HD DVD
2008 | VC-2 (Dirac) | ISO, BBC | Video on the Internet, HDTV broadcast, UHDTV
See also
- Algorithmic complexity theory
- Audio signal processing
- Audio storage
- Auditory masking
- Burrows–Wheeler transform
- Calgary Corpus
- Canterbury Corpus
- Comparison of audio codecs
- Comparison of file archivers
- Context mixing
- Data compression symmetry
- Data deduplication
- D-frame
- Dictionary coder
- Digital signal processing
- Distributed source coding
- Dyadic distribution
- Dynamic Markov compression
- Elias gamma coding
- Entropy encoding
- Fibonacci coding
- Fractal transform
- Golomb coding
- HTTP compression
- Image compression
- Information entropy
- List of archive formats
- List of codecs
- Magic compression algorithm
- Minimum description length
- Minimum message length
- Modulo-N code
- Mu-law
- Prediction by partial matching
- Psychoacoustics
- Range encoding
- Run-length encoding
- Self-extracting archive
- Subband encoding
- Subjective video quality
- Transcoding
- Universal code (data compression)
- Vector quantization
- Video compression format
- Video compression picture types
- Video quality
- Wavelet compression
External links
- Data Compression Basics (Video)
- Video compression 4:2:2 10-bit and its benefits
- Why does 10-bit save bandwidth (even when content is 8-bit)?
- Which compression technology should be used
- Wiley - Introduction to Compression Theory
- EBU subjective listening tests on low-bitrate audio codecs
- Audio Archiving Guide: Music Formats (Guide for helping a user pick out the right codec)
- MPEG 1&2 video compression intro (pdf format)
- hydrogenaudio.org wiki comparison
- Introduction to Data Compression by Guy E Blelloch from CMU
- HD Greetings - 1080p Uncompressed source material for compression testing and research
- Explanation of lossless signal compression method used by most codecs
- Interactive blind listening tests of audio codecs over the internet
- TestVid - 2,000+ HD and other uncompressed source video clips for compression testing
- Videsignline - Intro to Video Compression