Padding (cryptography)
Classical cryptography
Official messages often start and end in predictable ways: My dear ambassador, Weather report, Sincerely yours, etc. The primary use of padding with classical ciphers is to prevent the cryptanalyst from using that predictability to find cribs that aid in breaking the encryption. Random-length padding also prevents an attacker from knowing the exact length of the plaintext message.
Many classical ciphers arrange the plaintext into particular patterns (e.g., squares or rectangles), and if the plaintext doesn't exactly fit, it is often necessary to supply additional letters to fill out the pattern. Using nonsense letters for this purpose has the side benefit of making some kinds of cryptanalysis more difficult.
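A minimal sketch of filling out a rectangular pattern with nonsense letters. The function name pad_to_rectangle, the restriction to the letters A-Z, and the choice of random nulls are assumptions made for illustration; they do not reproduce any particular historical cipher.

```python
import secrets
import string


def pad_to_rectangle(plaintext: str, columns: int) -> str:
    """Append random letters so the text fills complete rows of `columns` letters."""
    # Keep letters only, as many classical systems did (an assumption here).
    letters = [c for c in plaintext.upper() if c.isalpha()]
    # Number of letters missing from the last, incomplete row (0 if it fits).
    shortfall = (-len(letters)) % columns
    nulls = [secrets.choice(string.ascii_uppercase) for _ in range(shortfall)]
    return "".join(letters + nulls)


def as_rows(text: str, columns: int) -> list[str]:
    """Split the padded text into rows of the rectangle."""
    return [text[i:i + columns] for i in range(0, len(text), columns)]


if __name__ == "__main__":
    padded = pad_to_rectangle("attack at dawn", 5)
    for row in as_rows(padded, 5):
        print(row)
```

With 12 letters and 5 columns, three random letters are appended to complete the third row.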
A famous example of classical padding which caused a great misunderstanding is "the world wonders", a phrase used as security padding in an encrypted message sent from Admiral Chester Nimitz to Admiral William Halsey, Jr. on October 25, 1944, during the Battle of Leyte Gulf.
Hash functions
Most modern cryptographic hash functions process messages in fixed-length blocks; all but the earliest (and most broken) of these hash functions include some sort of padding scheme. It is critical for cryptographic hash functions to employ termination schemes that prevent a hash from being extended; without such a scheme, many collision attacks become significantly easier. For example, one can find n collisions and produce further collisions simply by mixing and matching them. Targeted collisions do not typically become easier if a padding scheme is absent, but other domain-specific problems may arise: a cryptographic hash function used inside another construction, such as a message authentication code (MAC), causes the MAC to be completely broken if the hash is extendable.
Many padding schemes are based on appending predictable data to the final block (for example, the pad could be derived from the total length of the message, as in the Merkle-Damgård construction, a dated but still relevant technique).
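As an illustration of a pad derived from the message length, the sketch below follows the layout used by SHA-256: a 0x80 byte, zero bytes, then the message length in bits as a 64-bit big-endian integer, filling 64-byte blocks. The function name md_strengthen and its parameters are hypothetical; other hash functions differ in detail (MD5, for instance, stores the length in little-endian order).

```python
def md_strengthen(message: bytes, block_size: int = 64, length_size: int = 8) -> bytes:
    """Length-based padding in the style of Merkle-Damgård hash functions.

    Layout (assumed, SHA-256 style): message || 0x80 || 0x00... || bit length.
    """
    bit_length = len(message) * 8
    padded = message + b"\x80"  # a single '1' bit followed by seven '0' bits
    # Zero bytes needed so that, after the length field, the total is a
    # whole number of blocks.
    zeros = (-(len(padded) + length_size)) % block_size
    padded += b"\x00" * zeros
    return padded + bit_length.to_bytes(length_size, "big")


if __name__ == "__main__":
    out = md_strengthen(b"abc")
    print(len(out), out.hex())
```

Because the length is encoded in the final block, two messages of different lengths never share the same padded form.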
CBC mode
Cipher-block chaining (CBC) mode is a popular block cipher mode of operation. It requires messages whose length is a multiple of the block size (typically 8 or 16 bytes), so messages have to be padded to bring them to this length. One method is to fill out the last block with a 1-bit followed by zero bits. If the input happens to fill up an entire block, another block is added to accommodate the padding; otherwise, the end of the input plaintext might be misinterpreted as padding. Another method is to append n bytes with value (n−1) to the end of the plaintext to fill out a complete block. If the message already exactly fills a block, then for the same reason as before, a full block of padding is added. This means the padding is either one byte of 0, or two bytes of 1, etc. More intricate ways of ending a message, such as ciphertext stealing or residual block termination, avoid the need for such padding. However, today, CTR mode is largely replacing CBC mode, and CTR mode doesn't need padding at all.
There are timing attacks based on structured CBC padding.
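The second CBC method described above (n bytes, each of value n−1) can be sketched as follows. The names pad_n_minus_1 and unpad_n_minus_1 and the 16-byte default block size are assumptions for illustration. The validity check is written for clarity and is not constant time, which is exactly the kind of behaviour that timing attacks on structured padding exploit.

```python
def pad_n_minus_1(data: bytes, block_size: int = 16) -> bytes:
    """Append n bytes of value n-1, where 1 <= n <= block_size."""
    # A full block of padding is added when the data already fits exactly.
    n = block_size - (len(data) % block_size)
    return data + bytes([n - 1]) * n


def unpad_n_minus_1(padded: bytes, block_size: int = 16) -> bytes:
    """Remove the padding, raising ValueError if it is malformed."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("input is not a whole number of blocks")
    n = padded[-1] + 1  # the last byte holds n-1
    if n > block_size or padded[-n:] != bytes([n - 1]) * n:
        raise ValueError("malformed padding")
    return padded[:-n]


if __name__ == "__main__":
    print(pad_n_minus_1(b"A" * 14).hex())
```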
Bit padding
A single set ('1') bit is added to the message and then as many reset ('0') bits as required (possibly none) are added. The number of reset ('0') bits added will depend on the block boundary to which the message needs to be extended. In bit terms this is "1000 ... 0000", in hex byte terms this is "80 00 ... 00 00".
This method can be used to pad messages of any length in bits, not necessarily a whole number of bytes. For example, a message of 23 bits is padded with 9 bits in order to fill a 32-bit block:
... | 1011 1001 1101 0100 0010 0111 0000 0000 |
This padding is the first step of a two-step padding scheme used in many hash functions including MD5 and SHA. In this context, it is specified by RFC 1321 step 3.1.
In the context of using block ciphers to process variable-size messages, this padding scheme is known as ISO/IEC 9797-1 Padding Method 2.
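A small sketch of bit padding that works on strings of '0' and '1' characters, so that messages need not be a whole number of bytes. The function names bit_pad and bit_unpad and the string representation are assumptions chosen for readability, not an efficient implementation.

```python
def bit_pad(bits: str, block_bits: int = 32) -> str:
    """Append a single '1' bit, then '0' bits up to the next block boundary."""
    if set(bits) - {"0", "1"}:
        raise ValueError("expected a string of '0' and '1' characters")
    padded = bits + "1"
    padded += "0" * ((-len(padded)) % block_bits)  # possibly none
    return padded


def bit_unpad(padded: str) -> str:
    """Strip the trailing '0' bits and the final '1' bit."""
    marker = padded.rfind("1")  # the last '1' is always the padding marker
    if marker == -1:
        raise ValueError("no padding marker found")
    return padded[:marker]


if __name__ == "__main__":
    message = "10111001110101000010011"  # the 23-bit message from the example
    print(bit_pad(message))
```

Applied to the 23-bit message from the example above, this appends the 9 bits "100000000" and yields the 32-bit block shown there.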
Byte padding
ANSI X.923
In ANSI X.923, the padding consists of zero bytes, except for the last byte, which gives the number of padding bytes added and so marks the padding boundary.
Example:
In the following example the block size is 8 bytes, and padding is required for 4 bytes (in hexadecimal format)
... | DD DD DD DD DD DD DD DD | DD DD DD DD 00 00 00 04 |
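A minimal sketch of this layout, using the hypothetical names x923_pad and x923_unpad and assuming, as with the other byte-padding schemes, that a full block of padding is added when the data already ends on a block boundary.

```python
def x923_pad(data: bytes, block_size: int = 8) -> bytes:
    """Pad with zero bytes; the final byte holds the padding length."""
    n = block_size - (len(data) % block_size)  # 1..block_size
    return data + b"\x00" * (n - 1) + bytes([n])


def x923_unpad(padded: bytes) -> bytes:
    """Remove the padding, checking that the filler bytes are zero."""
    n = padded[-1] if padded else 0
    if n < 1 or n > len(padded) or padded[-n:-1] != b"\x00" * (n - 1):
        raise ValueError("malformed padding")
    return padded[:-n]


if __name__ == "__main__":
    print(x923_pad(b"\xdd" * 12).hex(" "))
```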
ISO 10126
ISO 10126 (withdrawn, 2007) specifies that the padding should be done at the end of the last block with random bytes, and that the padding boundary should be specified by the last byte.
Example:
In the following example the block size is 8 bytes and padding is required for 4 bytes
... | DD DD DD DD DD DD DD DD | DD DD DD DD 81 A6 23 04 |
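The same idea with random filler can be sketched as below. The names iso10126_pad and iso10126_unpad are hypothetical, and the use of Python's secrets module as the source of random bytes is an assumption; the filler bytes cannot be validated on removal because they carry no structure.

```python
import secrets


def iso10126_pad(data: bytes, block_size: int = 8) -> bytes:
    """Pad with random bytes; the final byte holds the padding length."""
    n = block_size - (len(data) % block_size)  # 1..block_size
    return data + secrets.token_bytes(n - 1) + bytes([n])


def iso10126_unpad(padded: bytes) -> bytes:
    """Remove the padding indicated by the last byte."""
    n = padded[-1] if padded else 0
    if n < 1 or n > len(padded):
        raise ValueError("malformed padding")
    return padded[:-n]


if __name__ == "__main__":
    print(iso10126_pad(b"\xdd" * 12).hex(" "))
```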
PKCS7
PKCS7 padding is described in RFC 5652.
Padding is in whole bytes. The value of each added byte is the number of bytes that are added, i.e. N bytes, each of value N, are added. The number of bytes added will depend on the block boundary to which the message needs to be extended.
The padding will be one of:
01
02 02
03 03 03
04 04 04 04
05 05 05 05 05
etc.
Example:
In the following example the block size is 8 bytes and padding is required for 4 bytes
... | DD DD DD DD DD DD DD DD | DD DD DD DD 04 04 04 04 |
PKCS5 padding is the same as PKCS7 padding, except that technically it can only be used to pad 64-bit (8-byte) blocks. In practice the two names are used interchangeably.
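A short sketch of PKCS7 padding and its removal, with hypothetical function names pkcs7_pad and pkcs7_unpad. The removal routine checks every padding byte; the comparison is not constant time, so it should be read as an illustration of the format rather than a hardened implementation.

```python
def pkcs7_pad(data: bytes, block_size: int = 8) -> bytes:
    """Append N bytes, each of value N, where 1 <= N <= block_size."""
    if not 1 <= block_size <= 255:
        raise ValueError("block size must fit in one byte")
    n = block_size - (len(data) % block_size)
    return data + bytes([n]) * n


def pkcs7_unpad(padded: bytes, block_size: int = 8) -> bytes:
    """Remove PKCS7 padding, raising ValueError if it is malformed."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("input is not a whole number of blocks")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("malformed padding")
    return padded[:-n]


if __name__ == "__main__":
    print(pkcs7_pad(b"\xdd" * 12).hex(" "))
```

Because the last byte always states the padding length, removal is unambiguous even when the message itself ends in bytes that look like padding.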
Zero padding
All the bytes that are required to be padded are padded with zero.
Example:
In the following example the block size is 8 bytes and padding is required for 4 bytes
... | DD DD DD DD DD DD DD DD | DD DD DD DD 00 00 00 00 |
Zero padding may not be reversible if the original file ends with one or more zero bytes, making it impossible to distinguish between plaintext data bytes and padding bytes.
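The ambiguity can be demonstrated with a small sketch. The names zero_pad and zero_unpad are hypothetical, and the sketch assumes that no padding is added when the data already fills a whole number of blocks.

```python
def zero_pad(data: bytes, block_size: int = 8) -> bytes:
    """Fill the last block with zero bytes (nothing is added if it is full)."""
    return data + b"\x00" * ((-len(data)) % block_size)


def zero_unpad(padded: bytes) -> bytes:
    """Strip all trailing zero bytes; this may also strip real data."""
    return padded.rstrip(b"\x00")


if __name__ == "__main__":
    original = b"ab\x00"  # plaintext that itself ends in a zero byte
    recovered = zero_unpad(zero_pad(original))
    print(original, recovered, original == recovered)
```

Here the recovered message is b"ab", one byte shorter than the original, because the trailing zero byte of the plaintext is indistinguishable from padding.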
Public key cryptography
In public key cryptography, padding is the process of preparing a message for encryption or signing using a specification or scheme such as PKCS#1 v2.0, OAEP, PSS, PSSR, IEEE P1363 EMSA2 and EMSA5. A popular example is OAEP used with RSA.
The operation is referred to as "padding" because originally, random material was simply appended to the message to make it long enough for the primitive, but this is not a secure form of padding and is no longer used. A modern padding scheme aims to ensure that the attacker cannot manipulate the plaintext to exploit the mathematical structure of the primitive and will usually be accompanied by a proof, often in the random oracle model, that breaking the padding scheme is as hard as solving the hard problem underlying the primitive.
Traffic analysis
Even if perfect cryptographic routines are used, the attacker can gain knowledge of the amount of traffic that was generated. The attacker might not know what Alice and Bob were talking about, but can know that they were talking and how much they talked. In certain circumstances this can be very bad. Consider for example when a military is organising a secret attack against another nation: it may suffice to alert the other nation for them to know merely that there is a lot of secret activity going on.
As another example, when encrypting Voice Over IP streams that use variable bit rate encoding, the number of bits per unit of time is not obscured, and this can be exploited to guess spoken phrases.
Padding messages is a way to make traffic analysis harder. Normally, a number of random bits are appended to the end of the message, with an indication at the end of how much random data was added. The amount of padding should be chosen at random with a minimum of 0, a maximum of N, and an even distribution between the two extremes. Note that raising the minimum above 0 does not help; only increasing N helps, though that also means that a lower percentage of the channel will be used to transmit real data. Also note that since the cryptographic routine is assumed to be uncrackable (otherwise the padding length itself is crackable), it does not help to put the padding anywhere else, e.g. at the beginning, in the middle, or in a sporadic manner. For the same reason, padding can be structured (e.g. it can simply be a run of zeros), though structured padding can be a hazard, as explained under timing attack.
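The random-length scheme described above can be sketched as follows, working in whole bytes rather than bits. The names pad_random and unpad_random, the 2-byte big-endian length field, and the use of Python's secrets module are assumptions for illustration; the padded message would then be encrypted, which is outside the scope of the sketch.

```python
import secrets

LENGTH_FIELD = 2  # bytes used to record how much padding was added (assumed)


def pad_random(message: bytes, max_pad: int) -> bytes:
    """Append k random bytes, k uniform in [0, max_pad], then the value of k."""
    if not 0 <= max_pad < 2 ** (8 * LENGTH_FIELD):
        raise ValueError("max_pad does not fit in the length field")
    k = secrets.randbelow(max_pad + 1)  # even distribution over 0..max_pad
    return message + secrets.token_bytes(k) + k.to_bytes(LENGTH_FIELD, "big")


def unpad_random(padded: bytes) -> bytes:
    """Read the trailing length field and drop that many padding bytes."""
    if len(padded) < LENGTH_FIELD:
        raise ValueError("input too short")
    k = int.from_bytes(padded[-LENGTH_FIELD:], "big")
    if len(padded) < LENGTH_FIELD + k:
        raise ValueError("malformed padding")
    return padded[: len(padded) - LENGTH_FIELD - k]


if __name__ == "__main__":
    padded = pad_random(b"attack at dawn", max_pad=64)
    print(len(padded), unpad_random(padded))
```

An observer of the ciphertext then learns only that the message length lies somewhere within a window of max_pad bytes.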
See also
- Russian copulation, another technique to prevent cribs
- Initialisation vector, salt (cryptography), which are sometimes confused with padding
- Key encapsulation, an alternative to padding for public key systems used to exchange symmetric keys