Caesar cipher
In cryptography, a Caesar cipher, also known as Caesar's cipher, the shift cipher, Caesar's code or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a shift of 3, A would be replaced by D, B would become E, and so on. The method is named after Julius Caesar, who used it to communicate with his generals.
The encryption step performed by a Caesar cipher is often incorporated as part of more complex schemes, such as the Vigenère cipher, and still has modern application in the ROT13 system. As with all single-alphabet substitution ciphers, the Caesar cipher is easily broken and in modern practice offers essentially no communication security.
Example
The transformation can be represented by aligning two alphabets; the cipher alphabet is the plain alphabet rotated left or right by some number of positions. For instance, here is a Caesar cipher using a left rotation of three places (the shift parameter, here 3, is used as the key):
Plain: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher: DEFGHIJKLMNOPQRSTUVWXYZABC
When encrypting, a person looks up each letter of the message in the "plain" line and writes down the corresponding letter in the "cipher" line. Deciphering is done in reverse.
Ciphertext: WKH TXLFN EURZQ IRA MXPSV RYHU WKH ODCB GRJ
Plaintext: the quick brown fox jumps over the lazy dog
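The two aligned alphabets above can be expressed as a lookup table. The following is a minimal Python sketch, assuming an uppercase 26-letter Latin alphabet; the function name make_tables is illustrative and not part of the original description.

```python
import string

def make_tables(shift):
    """Build encryption and decryption lookup tables for a given rotation."""
    shift %= 26
    plain = string.ascii_uppercase
    # The cipher alphabet is the plain alphabet rotated left by `shift` places.
    cipher = plain[shift:] + plain[:shift]
    return str.maketrans(plain, cipher), str.maketrans(cipher, plain)

encrypt_table, decrypt_table = make_tables(3)

# Characters outside A-Z (such as spaces) are left unchanged by translate().
ciphertext = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG".translate(encrypt_table)
print(ciphertext)                           # WKH TXLFN EURZQ IRA MXPSV RYHU WKH ODCB GRJ
print(ciphertext.translate(decrypt_table))  # THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG
```

Deciphering uses the same lookup with the two alphabets swapped, which is why the sketch builds both tables at once.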
The encryption can also be represented using modular arithmetic by first transforming the letters into numbers, according to the scheme A = 0, B = 1, ..., Z = 25. Encryption of a letter x by a shift n can be described mathematically as

$$E_n(x) = (x + n) \bmod 26.$$

Decryption is performed similarly,

$$D_n(x) = (x - n) \bmod 26.$$

(There are different definitions for the modulo operation. In the above, the result is in the range 0...25; that is, if x + n or x - n is not in the range 0...25, we have to subtract or add 26.)
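The modular-arithmetic description can be sketched in Python as follows. The function names are illustrative assumptions. Python's % operator already returns a result in the range 0...25 for a positive modulus, so no explicit adding or subtracting of 26 is needed here; other languages may behave differently for negative operands.

```python
def encrypt_letter(x, n):
    """Encrypt a letter number x (A = 0, ..., Z = 25) with shift n."""
    return (x + n) % 26

def decrypt_letter(x, n):
    """Decrypt a letter number x (A = 0, ..., Z = 25) with shift n."""
    return (x - n) % 26

def caesar(text, n, decrypt=False):
    """Apply the shift to every ASCII letter, leaving other characters unchanged."""
    step = decrypt_letter if decrypt else encrypt_letter
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr(base + step(ord(ch) - base, n)))
        else:
            out.append(ch)
    return "".join(out)

print(caesar("XYZ", 3))                # ABC (wraps around the end of the alphabet)
print(caesar("ABC", 3, decrypt=True))  # XYZ
```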
The replacement remains the same throughout the message, so the cipher is classed as a type of monoalphabetic substitution, as opposed to polyalphabetic substitution.
History and usage
The Caesar cipher is named after Julius Caesar, who, according to Suetonius, used it with a shift of three to protect messages of military significance. While Caesar's was the first recorded use of this scheme, other substitution ciphers are known to have been used earlier.
His great-nephew, Augustus, also used the cipher, but with a right shift of one, and it did not wrap around to the beginning of the alphabet.
There is evidence that Julius Caesar used more complicated systems as well, and one writer, Aulus Gellius, refers to a (now lost) treatise on his ciphers.
It is unknown how effective the Caesar cipher was at the time, but it is likely to have been reasonably secure, not least because most of Caesar's enemies would have been illiterate and others would have assumed that the messages were written in an unknown foreign language. There is no record at that time of any techniques for the solution of simple substitution ciphers. The earliest surviving records date to the 9th-century works of Al-Kindi in the Arab world, with the discovery of frequency analysis.
A Caesar cipher with a shift of one is used on the back of the Mezuzah to encrypt the names of God. This may be a holdover from an earlier time when Jewish people were not allowed to have Mezuzahs. The letters of the cryptogram themselves comprise a religiously significant "divine name" which Orthodox belief holds keeps the forces of evil in check.
In the 19th century, the personal advertisements section in newspapers would sometimes be used to exchange messages encrypted using simple cipher schemes. Kahn (1967) describes instances of lovers engaging in secret communications enciphered using the Caesar cipher in The Times. Even as late as 1915, the Caesar cipher was in use: the Russian army employed it as a replacement for more complicated ciphers which had proved to be too difficult for their troops to master; German and Austrian cryptanalysts had little difficulty in decrypting their messages.
Caesar ciphers can be found today in children's toys such as secret decoder rings. A Caesar shift of thirteen is also performed in the ROT13 algorithm, a simple method of obfuscating text widely found on Usenet and used to obscure text (such as joke punchlines and story spoilers), but not seriously used as a method of encryption.
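Because thirteen is half of 26, applying ROT13 twice restores the original text, so the same operation both hides and reveals a message. A short sketch using the "rot_13" text transform from Python's standard codecs module (the sample message is invented for illustration):

```python
import codecs

message = "The butler did it"

# ROT13 shifts each letter by thirteen places and leaves other characters alone.
hidden = codecs.encode(message, "rot_13")
print(hidden)                            # Gur ohgyre qvq vg

# Applying the same transform again undoes it.
print(codecs.encode(hidden, "rot_13"))   # The butler did it
```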
The Vigenère cipher uses a Caesar cipher with a different shift at each position in the text; the value of the shift is defined using a repeating keyword. If the keyword is as long as the message, chosen at random, never becomes known to anyone else, and is never reused, this is the one-time pad cipher, proven unbreakable. The conditions are so difficult to meet that they are, in practical effect, never achieved. Keywords shorter than the message (e.g., "Complete Victory", used by the Confederacy during the American Civil War) introduce a cyclic pattern that might be detected with a statistically advanced version of frequency analysis.
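The relationship to the Caesar cipher can be made concrete with a small sketch in which each keyword letter supplies the shift for one plaintext letter (A = 0, B = 1, and so on). This is an illustrative Python version under the assumption of an uppercase 26-letter alphabet; the function name and the sample message are not from the original text.

```python
from itertools import cycle

def vigenere(text, keyword, decrypt=False):
    """Caesar-shift each letter by an amount taken from a repeating keyword."""
    key_shifts = [ord(k) - ord("A") for k in keyword.upper() if "A" <= k <= "Z"]
    if not key_shifts:
        raise ValueError("keyword must contain at least one letter")
    shifts = cycle(key_shifts)
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            n = next(shifts)
            if decrypt:
                n = -n
            out.append(chr((ord(ch) - ord("A") + n) % 26 + ord("A")))
        else:
            out.append(ch)  # non-letters pass through and do not consume a key letter
    return "".join(out)

secret = vigenere("ATTACK AT DAWN", "Complete Victory")
print(secret)
print(vigenere(secret, "Complete Victory", decrypt=True))  # ATTACK AT DAWN
```

With a one-letter keyword the sketch reduces to an ordinary Caesar cipher, since every position then receives the same shift.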
In April 2006, fugitive Mafia boss Bernardo Provenzano was captured in Sicily partly because some of his messages, written in a variation of the Caesar cipher, were broken. Provenzano's cipher used numbers, so that "A" would be written as "4", "B" as "5", and so on.
In 2011, Rajib Karim was convicted in the United Kingdom of "terrorism offences" after using the Caesar cipher to communicate with Bangladeshi Islamic activists discussing plots to blow up British Airways planes or disrupt their IT networks. Although the parties had access to far better encryption techniques (Karim himself used PGP for data storage on computer disks), they chose to use their own scheme instead (implemented in Microsoft Excel) "because 'kaffirs', or non-believers, know about it [ie, PGP] so it must be less secure".
Breaking the cipher
Decryption shift | Candidate plaintext |
---|---|
0 | exxegoexsrgi |
1 | dwwdfndwrqfh |
2 | cvvcemcvqpeg |
3 | buubdlbupodf |
4 | attackatonce |
5 | zsszbjzsnmbd |
6 | yrryaiyrmlac |
... | |
23 | haahjrhavujl |
24 | gzzgiqgzutik |
25 | fyyfhpfytshj |
The Caesar cipher can be easily broken even in a ciphertext-only scenario. Two situations can be considered:
- an attacker knows (or guesses) that some sort of simple substitution cipher has been used, but not specifically that it is a Caesar scheme;
- an attacker knows that a Caesar cipher is in use, but does not know the shift value.
In the first case, the cipher can be broken using the same techniques as for a general simple substitution cipher, such as frequency analysis or pattern words. While solving, it is likely that an attacker will quickly notice the regularity in the solution and deduce that a Caesar cipher is the specific algorithm employed.
In the second instance, breaking the scheme is even more straightforward. Since there are only a limited number of possible shifts (26 in English, counting the trivial shift of zero), they can each be tested in turn in a brute force attack. One way to do this is to write out a snippet of the ciphertext in a table of all possible shifts, a technique sometimes known as "completing the plain component". The table above does this for the ciphertext "EXXEGOEXSRGI"; the plaintext is instantly recognisable by eye at a shift of four. Another way of viewing this method is that, under each letter of the ciphertext, the entire alphabet is written out in reverse starting at that letter. This attack can be accelerated using a set of strips prepared with the alphabet written down them in reverse order. The strips are then aligned to form the ciphertext along one row, and the plaintext should appear in one of the other rows.
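The table of all possible shifts is straightforward to generate by program. A minimal Python sketch, assuming the ciphertext consists only of the uppercase letters A to Z (the function name shift_back is illustrative):

```python
def shift_back(ciphertext, n):
    """Return the candidate plaintext for decryption shift n, in lowercase."""
    return "".join(
        chr((ord(c) - ord("A") - n) % 26 + ord("a")) for c in ciphertext
    )

# Print every candidate; a reader then picks out the row that makes sense.
for n in range(26):
    print(f"{n:2d} | {shift_back('EXXEGOEXSRGI', n)}")
```

The row for shift 4 reads "attackatonce", matching the table above.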
Another brute force approach is to match up the frequency distribution of the letters. By graphing the frequencies of letters in the ciphertext, and by knowing the expected distribution of those letters in the original language of the plaintext, a human can easily spot the value of the shift by looking at the displacement of particular features of the graph. This is known as frequency analysis. For example, in the English language the plaintext frequencies of the letters E and T (usually the most frequent) and Q and Z (typically the least frequent) are particularly distinctive. Computers can also do this by measuring how well the actual frequency distribution matches up with the expected distribution; for example, the chi-squared statistic can be used.
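A possible way to automate the comparison is sketched below in Python: each of the 26 candidate decryptions is scored with the chi-squared statistic against expected English letter frequencies, and the lowest score wins. The frequency values are approximate figures assumed for illustration, and the function names and sample sentence are likewise not from the original text.

```python
from collections import Counter

# Approximate relative frequencies (percent) of A..Z in English text (assumed values).
ENGLISH_FREQ = [
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03, 2.41,
    6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07,
]

def shift_text(text, n):
    """Shift each letter forward by n positions (a negative n shifts back)."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + n) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

def chi_squared(text):
    """Measure how far the letter counts in text are from the expected counts."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total == 0:
        return 0.0
    counts = Counter(letters)
    score = 0.0
    for i, pct in enumerate(ENGLISH_FREQ):
        expected = total * pct / 100
        observed = counts[chr(ord("A") + i)]
        score += (observed - expected) ** 2 / expected
    return score

def guess_shift(ciphertext):
    """Return the decryption shift whose candidate plaintext looks most like English."""
    return min(range(26), key=lambda n: chi_squared(shift_text(ciphertext, -n)))

sample = ("It was the best of times, it was the worst of times, "
          "it was the age of wisdom, it was the age of foolishness")
ciphertext = shift_text(sample, 7)
print(guess_shift(ciphertext))
```

The statistic is only reliable when the text is long enough for its letter counts to resemble the language's typical distribution; very short or unusual texts may yield a wrong guess.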
For natural language plaintext, there will, in all likelihood, be only one plausible decryption, although for extremely short plaintexts, multiple candidates are possible. For example, the ciphertext "ALIIP" could plausibly decrypt to either "dolls" or "wheel", and "AFCCP" to either "jolly" or "cheer" (assuming the plaintext is in English; see also unicity distance).
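The ambiguity of short ciphertexts can be checked by trying every shift and keeping the candidates that are real words. The Python sketch below uses a tiny hand-picked word set as a stand-in for a dictionary; that set and the function name are assumptions made for illustration.

```python
# A tiny stand-in for an English dictionary (assumption for illustration).
WORDS = {"dolls", "wheel", "jolly", "cheer"}

def candidates(ciphertext):
    """Map each decryption shift to its plaintext when that plaintext is a known word."""
    found = {}
    for n in range(26):
        plain = "".join(
            chr((ord(c) - ord("A") - n) % 26 + ord("a")) for c in ciphertext
        )
        if plain in WORDS:
            found[n] = plain
    return found

print(candidates("ALIIP"))  # {4: 'wheel', 23: 'dolls'}
print(candidates("AFCCP"))  # {17: 'jolly', 24: 'cheer'}
```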
Multiple encryptions and decryptions provide no additional security. This is because two encryptions of, say, shift A and shift B, will be equivalent to an encryption with shift A + B (modulo 26). In mathematical terms, the encryption under various keys forms a group.
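The equivalence can be verified directly. The following Python sketch, with an illustrative function name and sample message, checks that encrypting with one shift and then another gives the same result as a single encryption with the sum of the shifts, for every pair of shifts:

```python
def shift(text, n):
    """Caesar-shift the uppercase letters of text by n positions."""
    return "".join(
        chr((ord(c) - ord("A") + n) % 26 + ord("A")) if "A" <= c <= "Z" else c
        for c in text
    )

message = "ATTACK AT ONCE"

# Shifting by 3 and then by 5 is the same as shifting once by 8.
print(shift(shift(message, 3), 5) == shift(message, 8))  # True

# The same holds for every pair of shifts, with the sum taken modulo 26.
all_pairs_agree = all(
    shift(shift(message, a), b) == shift(message, (a + b) % 26)
    for a in range(26)
    for b in range(26)
)
print(all_pairs_agree)  # True
```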