Code (cryptography)
In cryptography, a code is a method used to transform a message into an obscured form, so that anyone who lacks the special information, or key, required to apply the transform cannot understand what is actually transmitted. The usual method is to use a codebook with a list of common phrases or words matched with a codeword. Encoded messages are sometimes termed codetext, while the original message is usually referred to as plaintext.
Terms like code and in code are often used to refer to any form of encryption. However, in technical work there is an important distinction between codes and ciphers: essentially, the scope of the transformation involved. Codes operate at the level of meaning; that is, words or phrases are converted into something else. Ciphers work at the level of individual letters, or small groups of letters, or even, in modern ciphers, individual bits. While a code might transform "change" into "CVGDK" or "cocktail lounge", a cipher transforms elements below the semantic level, i.e., below the level of meaning. The "a" in "attack" might be converted to "Q", the first "t" to "f", the second "t" to "3", and so on. Ciphers are more convenient than codes in some situations: they need no codebook, with its inherently limited number of valid messages, and they allow fast automatic operation on computers.
Codes were long believed to be more secure than ciphers, since (if the compiler of the codebook did a good job) there is no pattern of transformation which can be discovered, whereas ciphers use a consistent transformation, which can potentially be identified and reversed (except in the case of the one-time pad).
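The difference in scope between a code and a cipher can be illustrated with a minimal Python sketch. The codebook entries other than "change" and the use of a simple Caesar shift are assumptions chosen for illustration; they are not taken from any historical system, and a Caesar shift, unlike the example above, maps every "t" to the same letter.

```python
# Hypothetical toy codebook: whole words or phrases map to arbitrary codegroups.
CODEBOOK = {
    "change": "CVGDK",
    "attack": "QXLMR",
    "at dawn": "ZBTNA",
}


def encode_with_code(phrases):
    """Replace each whole phrase with its codegroup (works at the level of meaning).

    Raises KeyError for a phrase that is not in the codebook, which reflects
    the limited set of messages a code can express.
    """
    return [CODEBOOK[p] for p in phrases]


def encipher_caesar(text, shift=3):
    """Shift each ASCII letter independently (works below the level of meaning)."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("a") if ch.islower() else ord("A")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)  # leave spaces and punctuation untouched
    return "".join(out)


print(encode_with_code(["attack", "at dawn"]))  # ['QXLMR', 'ZBTNA']
print(encipher_caesar("attack at dawn"))        # dwwdfn dw gdzq
```

The code can only express phrases that appear in its codebook, whereas the cipher accepts any text.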
One- and two-part codes
Codes are defined by "codebooks" (physical or notional), which are dictionaries of codegroups listed with their corresponding plaintext. Codes originally had the codegroups assigned in "plaintext order" for the convenience of the code designer or the encoder. For example, in a code using numeric code groups, a plaintext word starting with "a" would have a low-value group, while one starting with "z" would have a high-value group. The same codebook could be used to "encode" a plaintext message into a coded message or "codetext", and to "decode" a codetext back into a plaintext message.
However, such "one-part" codes had a certain predictability that made it easier for others to notice patterns and "crack" or "break" the message, revealing the plaintext, or part of it. In order to make life more difficult for codebreakers, codemakers designed codes with no predictable relationship between the codegroups and the ordering of the matching plaintext. In practice, this meant that two codebooks were now required, one to find codegroups for encoding, the other to look up codegroups to find plaintext for decoding. Students of foreign languages work much the same way; a Frenchman studying English, say, needs both an English-French and a French-English dictionary. Such "two-part" codes required more effort to develop, and twice as much effort to distribute (and discard safely when replaced), but they were harder to break.
One-time code
A one-time code is a prearranged word, phrase or symbol that is intended to be used only once to convey a simple message, often the signal to execute or abort some plan or confirm that it has succeeded or failed. One-time codes are often designed to be included in what would appear to be an innocent conversation. Done properly they are almost impossible to detect, though a trained analyst monitoring the communications of someone who has already aroused suspicion might be able to recognize a comment like "Aunt Bertha has gone into labor" as having an ominous meaning. Famous examples of one-time codes include:
- "One if by land; two if by sea" in "Paul Revere's Ride", made famous in the poem by Henry Wadsworth Longfellow.
- "Climb Mount Niitaka", the signal to Japanese planes to begin the attack on Pearl Harbor.
- During World War II the British Broadcasting Corporation's overseas service frequently included "personal messages" as part of its regular broadcast schedule. The seemingly nonsensical stream of messages read out by announcers were actually one-time codes intended for Special Operations Executive (SOE) agents operating behind enemy lines. An example might be "The princess wears red shoes" or "Mimi's cat is asleep under the table". Each code message was read out twice. By such means, the French Resistance were instructed to start sabotaging rail and other transport links the night before D-day.
- "Over all of Spain, the sky is clear" was a signal (broadcast on radio) to start the nationalist military revolt in Spain on July 17, 1936.
Sometimes messages are not prearranged and rely on shared knowledge hopefully known only to the recipients. An example is the telegram sent to U.S. President Harry Truman, then in Potsdam to meet with Soviet premier Joseph Stalin, informing Truman of the first successful test of an atomic bomb.
- "Operated on this morning. Diagnosis not yet complete but results seem satisfactory and already exceed expectations. Local press release necessary as interest extends great distance. Dr. Groves pleased. He returns tomorrow. I will keep you posted."
See also one-time pad, an unrelated cipher algorithm.
Idiot code
An idiot code is a code that is created by the parties using it. This type of communication is akin to the hand signals used by armies in the field.
Example: Any sentence where 'day' and 'night' are used means 'attack'. The location mentioned in the following sentence specifies the location to be attacked.
- Plaintext: Attack Gotham.
- Codetext: We walked day and night through the streets but couldn't find it! Tomorrow we'll head into Gotham.
An early use of the term appears to be by George Perrault, a character in the 1982 science fiction novel Friday by Robert A. Heinlein:
- The simplest sort [of code] and thereby impossible to break. The first ad told the person or persons concerned to carry out number seven or expect number seven or it said something about something designated as seven. This one says the same with respect to code item number ten. But the meaning of the numbers cannot be deduced through statistical analysis because the code can be changed long before a useful statistical universe can be reached. It's an idiot code... and an idiot code can never be broken if the user has the good sense not to go too often to the well.
Richard Miniter, author of Losing Bin Laden: How Bill Clinton's Failures Unleashed Global Terror, was quoted in an interview by UPI Technology News:
- Another way terrorists use the Internet to communicate is through conventional message boards. They simply go to common public places online, chat rooms and the like, and post messages using what intelligence operatives call an "idiot code", said Miniter.
Terrorism expert Magnus Ranstorp said that the men who carried out the September 11, 2001, attacks on the United States used basic e-mail and what he calls "idiot code" to discuss their plans.
Cryptanalysis of codes
While solving a monoalphabetic substitution cipher is easy, solving even a simple code is difficult. Decrypting a coded message is a little like trying to translate a document written in a foreign language, with the task basically amounting to building up a "dictionary" of the codegroups and the plaintext words they represent.
One fingerhold on a simple code is the fact that some words are more common than others, such as "the" or "a" in English. In telegraphic messages, the codegroup for "STOP" (i.e., end of sentence or paragraph) is usually very common. This helps define the structure of the message in terms of sentences, if not their meaning, and this is cryptanalytically useful.
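The frequency fingerhold can be sketched as a simple count of codegroups across intercepted messages. The intercepts below are invented for illustration, and the assumption that codegroups are separated by whitespace is ours; a real analysis would of course need far more traffic.

```python
from collections import Counter


def codegroup_frequencies(messages):
    """Count how often each whitespace-separated codegroup occurs in all messages."""
    counts = Counter()
    for message in messages:
        counts.update(message.split())
    # Most frequent groups first; these are candidates for very common
    # plaintext such as "STOP", "the" or "a".
    return counts.most_common()


# Hypothetical intercepts, invented for this example.
intercepts = [
    "48213 10577 90001 33120 90001",
    "10577 62044 90001",
    "33120 48213 71150 90001",
]

for group, count in codegroup_frequencies(intercepts):
    print(group, count)
```

Here the group "90001" occurs four times and closes every message, so an analyst might tentatively label it as a sentence terminator and use it to split the traffic into sentences.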
Further progress can be made against a code by collecting many codetexts encrypted with the same code and then using information from other sources:
- spies,
- newspapers,
- diplomatic cocktail party chat,
- the location from where a message was sent,
- where it was being sent to (i.e., traffic analysis),
- the time the message was sent,
- events occurring before and after the message was sent,
- the normal habits of the people sending the coded messages,
- etc.
For example, a particular codegroup found almost exclusively in messages from a particular army and nowhere else might very well indicate the commander of that army. A codegroup that appears in messages preceding an attack on a particular location may very well stand for that location.
Of course, cribs can be an immediate giveaway to the definitions of codegroups. As codegroups are determined, they can gradually build up a critical mass, with more and more codegroups revealed from context and educated guesswork. One-part codes are more vulnerable to such educated guesswork than two-part codes: if the codenumber "26839" of a one-part code is determined to stand for "bulldozer", then the lower codenumber "17598" will likely stand for a plaintext word that starts with "a" or "b", at least in a simple one-part code.
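The guesswork against a one-part code can be sketched as follows: because codenumbers rise with alphabetical order, already-solved groups bracket the plaintext of an unsolved one. The function name and the recovered entries other than 26839 ("bulldozer") are hypothetical, and the sketch assumes a simple one-part code with strictly alphabetical assignment.

```python
import bisect


def bracket_plaintext(known, unknown):
    """Bound the plaintext of an unsolved codenumber in a one-part code.

    known maps solved codenumbers (int) to their plaintext words. Returns a
    (lower, upper) pair of words between which the unknown plaintext should
    fall alphabetically; None marks a side with no solved neighbour.
    """
    if unknown in known:
        return known[unknown], known[unknown]
    numbers = sorted(known)
    i = bisect.bisect_left(numbers, unknown)
    lower = known[numbers[i - 1]] if i > 0 else None
    upper = known[numbers[i]] if i < len(numbers) else None
    return lower, upper


# Hypothetical recovered entries; only 26839 -> "bulldozer" comes from the text.
recovered = {5120: "airfield", 26839: "bulldozer", 61307: "march"}

print(bracket_plaintext(recovered, 17598))  # ('airfield', 'bulldozer')
```

In a two-part code the same lookup tells the analyst nothing, since codenumber order carries no information about alphabetical order.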
Various tricks can be used to "plant" or "sow" information into a coded message, for example by executing a raid at a particular time and location against an enemy, and then examining code messages sent after the raid. Coding errors are a particularly useful fingerhold into a code; people reliably make errors, sometimes disastrous ones. Of course, planting data and exploiting errors work against ciphers as well.
- The most obvious and, in principle at least, simplest way of cracking a code is to steal the codebook through bribery, burglary, or raiding parties (procedures sometimes glorified by the phrase "practical cryptography"), and this is a weakness for both codes and ciphers, though codebooks are generally larger and used longer than cipher keys. While a good code may be harder to break than a cipher, the need to write and distribute codebooks is seriously troublesome.
Constructing a new code is like building a new language and writing a dictionary for it; it was an especially big job before computers. If a code is compromised, the entire task must be done all over again, and that means a lot of work for both cryptographers and the code users. In practice, when codes were in widespread use, they were usually changed on a periodic basis to frustrate codebreakers, and to limit the useful life of stolen or copied codebooks.
Once codes have been created, codebook distribution is logistically clumsy, and increases the chances that the code will be compromised. There is a saying, attributed to Benjamin Franklin, that "Three people can keep a secret if two of them are dead," and though it may be something of an exaggeration, a secret becomes harder to keep if it is shared among several people. Codes can be thought reasonably secure if they are only used by a few careful people, but if whole armies use the same codebook, security becomes much more difficult.
In contrast, the security of ciphers is generally dependent on protecting the cipher keys. Cipher keys can be stolen and people can betray them, but they are much easier to change and distribute.
Superencipherment
In more recent practice, it became typical to encipher a message after first encoding it, so as to provide greater security by increasing the degree of difficulty for cryptanalysts. With a numerical code, this was commonly done with an "additive", simply a long key number which was added digit by digit to the code groups, modulo 10. Unlike the codebooks, additives would be changed frequently. The famous Japanese Navy code, JN-25, was of this design, as were several of the (confusingly named) Royal Navy Cyphers used after WWI and into WWII.
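The additive step can be sketched in a few lines: each digit of the codetext is added to the matching digit of the key number without carrying, which is the same as addition modulo 10. The function names, the sample additive, and the choice to repeat the additive when it runs out are assumptions made for illustration, not a description of how JN-25 or any other system handled its additive tables.

```python
from itertools import cycle


def add_additive(codegroups, additive):
    """Superencipher numeric codegroups by non-carrying addition, modulo 10."""
    key = cycle(additive)  # assumption: reuse the additive if it is too short
    return [
        "".join(str((int(d) + int(next(key))) % 10) for d in group)
        for group in codegroups
    ]


def strip_additive(groups, additive):
    """Recover the underlying codegroups by non-carrying subtraction, modulo 10."""
    key = cycle(additive)
    return [
        "".join(str((int(d) - int(next(key))) % 10) for d in group)
        for group in groups
    ]


codetext = ["26839", "17598"]   # codegroups taken from the codebook
additive = "4701938265"         # hypothetical key number

sent = add_additive(codetext, additive)
print(sent)                            # ['63848', '45753']
print(strip_additive(sent, additive))  # ['26839', '17598']
```

The receiver strips the additive first and only then looks the recovered groups up in the codebook.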
One might wonder why a code would be used if it had to be enciphered to provide security. As well as providing security, a well designed code can also compress the message, and provide some degree of automatic error correction. For ciphers, the same degree of error correction has generally required use of computers.
See also
- Code, its more general communications meaning
- List of coding terms
- Trench code
- JN-25
- Zimmermann telegram
- Code talkers