Raptor codes
Encyclopedia
In computer science
, raptor codes (rapid tornado; see Tornado codes
) are the first known class of fountain codes with linear time encoding and decoding. They were invented by Amin Shokrollahi
in 2000/2001 and were first published in 2004 as an extended abstract. Raptor codes are a significant theoretical and practical improvement over LT codes
, which were the first practical class of fountain codes.
Raptor codes, as with fountain codes in general, encode a given message consisting of a number of symbols, k, into a potentially limitless sequence of encoding symbols such that knowledge of any k or more encoding symbols allows the message to be recovered with some non-zero probability. The probability that the message can be recovered increases with the number of symbols received above k becoming very close to 1, once the number of received symbols is only very slightly larger than k. For example, with the latest generation of Raptor codes, the RaptorQ codes, the chance of decoding failure when k symbols have been received is less than 1%, and the chance of decoding failure when k+2 symbols have been received is less than one in a million. A symbol can be any size, from a single byte to hundreds or thousands of bytes.
Raptor codes may be systematic or non-systematic. In the systematic case, the symbols of the original message are included within the set of encoding symbols. An example of a systematic raptor code is the code defined by the 3rd Generation Partnership Project for use in mobile cellular wireless broadcast and multicast and also used by DVB-H standards for IP datacast to handheld devices (see external links). The Raptor codes in these standards is defined also in IETF RFC 5053. The most advanced version of a practical Raptor code is RaptorQ defined in IETF RFC 6330.
Online codes
are another example of a non-systematic raptor code.
A fixed rate erasure code
, usually with a fairly high rate, is applied as a 'pre-code' or 'outer code'. This pre-code may itself be a concatenation of multiple codes, for example in the code standardized by 3GPP a high density parity check code derived from the binary Gray sequence is concatenated with a simple regular low density parity check code. Another possibility would be a concatenation of a Hamming code
with a low density parity check code.
The inner code takes the result of the pre-coding operation and generates a sequence of encoding symbols. The inner code is a form of LT codes. Each encoding symbol is the XOR of a randomly chosen set of symbols from the pre-code output. The number of symbols which are XOR'ed together to form an output symbol is chosen randomly for each output symbol according to a specific probability distribution.
This distribution, as well as the mechanism for generating random numbers for sampling this distribution and for choosing the symbols to be XOR'ed, must be known to both sender and receiver. In one approach, each symbol is accompanied with an identifier which can be used as a seed to a pseudo-random number generator to generate this information, with the same process being followed by both sender and receiver.
In the case of non-systematic raptor codes, the source data to be encoded is used as the input to the pre-coding stage.
In the case of systematic raptor codes, the input to the pre-coding stage is obtained by first applying the inverse of the encoding operation that generates the first k output symbols to the source data. Thus, applying the normal encoding operation to the resulting symbols causes the original source symbols to be regenerated as the first k output symbols of the code. It is necessary to ensure that the random
processes which generate the first k output symbols generate an operation which is invertible.
In a combined approach, the relationships between symbols defined by both the inner and outer codes are considered as a single combined set of simultaneous equations which can be solved by the usual means, typically by Gaussian elimination
.
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
, raptor codes (rapid tornado; see Tornado codes
Tornado codes
In computer science, Tornado codes are a class of erasure codes that support error correction. Tornado codes require a constant C more redundant blocks than the more data-efficient Reed-Solomon erasure codes, but are much faster to generate and can fix erasures faster...
) are the first known class of fountain codes with linear time encoding and decoding. They were invented by Amin Shokrollahi
Amin Shokrollahi
Amin Shokrollahi is a mathematician who has worked on a variety of topics including coding theory, and algebraic complexity theory. He is best known for his work on iterative decoding of graph based codes for which he received the IEEE Information Theory Best Paper Award of 2002...
in 2000/2001 and were first published in 2004 as an extended abstract. Raptor codes are a significant theoretical and practical improvement over LT codes
LT codes
In computer science, Luby transform codes are the first class of practical fountain codes that are near optimal erasure correcting codes invented by Michael Luby in 1998 and published in 2002. Like some other fountain codes, LT codes depend on sparse bipartite graphs to trade reception overhead...
, which were the first practical class of fountain codes.
Raptor codes, as with fountain codes in general, encode a given message consisting of a number of symbols, k, into a potentially limitless sequence of encoding symbols such that knowledge of any k or more encoding symbols allows the message to be recovered with some non-zero probability. The probability that the message can be recovered increases with the number of symbols received above k becoming very close to 1, once the number of received symbols is only very slightly larger than k. For example, with the latest generation of Raptor codes, the RaptorQ codes, the chance of decoding failure when k symbols have been received is less than 1%, and the chance of decoding failure when k+2 symbols have been received is less than one in a million. A symbol can be any size, from a single byte to hundreds or thousands of bytes.
Raptor codes may be systematic or non-systematic. In the systematic case, the symbols of the original message are included within the set of encoding symbols. An example of a systematic raptor code is the code defined by the 3rd Generation Partnership Project for use in mobile cellular wireless broadcast and multicast and also used by DVB-H standards for IP datacast to handheld devices (see external links). The Raptor codes in these standards is defined also in IETF RFC 5053. The most advanced version of a practical Raptor code is RaptorQ defined in IETF RFC 6330.
Online codes
Online codes
In computer science, online codes are an example of rateless erasure codes. These codes can encode a message into a number of symbols such that knowledge of any fraction of them allows one to recover the original message...
are another example of a non-systematic raptor code.
Overview
Raptor codes are formed by the concatenation of two codes.A fixed rate erasure code
Erasure code
In information theory, an erasure code is a forward error correction code for the binary erasure channel, which transforms a message of k symbols into a longer message with n symbols such that the original message can be recovered from a subset of the n symbols...
, usually with a fairly high rate, is applied as a 'pre-code' or 'outer code'. This pre-code may itself be a concatenation of multiple codes, for example in the code standardized by 3GPP a high density parity check code derived from the binary Gray sequence is concatenated with a simple regular low density parity check code. Another possibility would be a concatenation of a Hamming code
Hamming code
In telecommunication, Hamming codes are a family of linear error-correcting codes that generalize the Hamming-code invented by Richard Hamming in 1950. Hamming codes can detect up to two and correct up to one bit errors. By contrast, the simple parity code cannot correct errors, and can detect only...
with a low density parity check code.
The inner code takes the result of the pre-coding operation and generates a sequence of encoding symbols. The inner code is a form of LT codes. Each encoding symbol is the XOR of a randomly chosen set of symbols from the pre-code output. The number of symbols which are XOR'ed together to form an output symbol is chosen randomly for each output symbol according to a specific probability distribution.
This distribution, as well as the mechanism for generating random numbers for sampling this distribution and for choosing the symbols to be XOR'ed, must be known to both sender and receiver. In one approach, each symbol is accompanied with an identifier which can be used as a seed to a pseudo-random number generator to generate this information, with the same process being followed by both sender and receiver.
In the case of non-systematic raptor codes, the source data to be encoded is used as the input to the pre-coding stage.
In the case of systematic raptor codes, the input to the pre-coding stage is obtained by first applying the inverse of the encoding operation that generates the first k output symbols to the source data. Thus, applying the normal encoding operation to the resulting symbols causes the original source symbols to be regenerated as the first k output symbols of the code. It is necessary to ensure that the random
processes which generate the first k output symbols generate an operation which is invertible.
Decoding
Two approaches are possible for decoding raptor codes. In a concatenated approach, the inner code is decoded first, using a belief propagation algorithm, as used for the LT codes. Decoding succeeds if this operation recovers a sufficient number of symbols, such that the outer code can recover the remaining symbols using the decoding algorithm appropriate for that code.In a combined approach, the relationships between symbols defined by both the inner and outer codes are considered as a single combined set of simultaneous equations which can be solved by the usual means, typically by Gaussian elimination
Gaussian elimination
In linear algebra, Gaussian elimination is an algorithm for solving systems of linear equations. It can also be used to find the rank of a matrix, to calculate the determinant of a matrix, and to calculate the inverse of an invertible square matrix...
.
Computational complexity
Raptor codes require O(1) time to generate an encoding symbol. Decoding a message of length k with a belief propagation decoding algorithm require O(k) time for the appropriate choice of inner/outer codes.See also
- Erasure codeErasure codeIn information theory, an erasure code is a forward error correction code for the binary erasure channel, which transforms a message of k symbols into a longer message with n symbols such that the original message can be recovered from a subset of the n symbols...
- LT code
- Fountain codes
- Michael LubyMichael LubyMichael George Luby is a mathematician and computer scientist, VP Technology at Qualcomm and former co-founder and Chief Technology Officer of Digital Fountain. In coding theory he is known for leading the invention of the Tornado codes and the LT codes...
- Tornado codesTornado codesIn computer science, Tornado codes are a class of erasure codes that support error correction. Tornado codes require a constant C more redundant blocks than the more data-efficient Reed-Solomon erasure codes, but are much faster to generate and can fix erasures faster...