Plaintext
In cryptography, plaintext is information a sender wishes to transmit to a receiver. Cleartext is often used as a synonym. Before the computer era, plaintext most commonly meant message text in the language of the communicating parties.
Plaintext is defined with reference to the operation of cryptographic algorithms, usually encryption algorithms: it is the input upon which they operate. Cleartext, by contrast, refers to data that is transmitted or stored unencrypted (that is, 'in the clear').
Since computers became commonly available, the definition has expanded to cover not only electronic representations of traditional text, such as messages (e.g., email) and document content (e.g., word processor files), but also computer representations of sound (e.g., speech or music), images (e.g., photos or videos), ATM and credit card transaction information, sensor data, and so forth. Few of these are directly meaningful to humans, having already been transformed into computer-manipulable forms. Any information that the communicating parties wish to conceal from others can now be treated, and referred to, as plaintext. In this sense, plaintext is the 'normal' representation of data before any action has been taken to conceal, compress, or 'digest' it. It need not represent text, and even if it does, the text may not be "plain".
Plaintext is used as input to an encryption algorithm; the output is usually termed ciphertext, particularly when the algorithm is a cipher. The term codetext is less often used, and almost always only when the algorithm involved is actually a code. In some systems, however, multiple layers of encryption are used, in which case the output of one encryption algorithm becomes plaintext input for the next.
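The layering described above can be sketched in a few lines of Python. This is only an illustration: `toy_encrypt` is a hypothetical, deliberately simple XOR stream cipher built from SHA-256 and is not secure, and the function names and keys are assumptions made for the example. The point is that each layer treats whatever it receives, including another layer's ciphertext, as its plaintext.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over the key and a counter. Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """XOR the input with the keystream. Applying it again with the same key decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

toy_decrypt = toy_encrypt  # XOR is its own inverse

def layered_encrypt(keys, plaintext: bytes) -> bytes:
    """Apply one encryption layer per key; each layer's output is the next layer's plaintext."""
    data = plaintext
    for key in keys:
        data = toy_encrypt(key, data)
    return data

def layered_decrypt(keys, ciphertext: bytes) -> bytes:
    """Remove the layers in reverse order."""
    data = ciphertext
    for key in reversed(keys):
        data = toy_decrypt(key, data)
    return data

if __name__ == "__main__":
    keys = [b"inner key", b"outer key"]
    ciphertext = layered_encrypt(keys, b"attack at dawn")
    print(layered_decrypt(keys, ciphertext))
```

From the point of view of the outer layer, the inner layer's ciphertext is simply the data it was asked to protect, which is why the word plaintext is relative to a particular algorithm rather than to human readability.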
Secure handling of plaintext
In a cryptosystem, weaknesses can be introduced through insecure handling of plaintext, allowing an attacker to bypass the cryptography altogether. Plaintext is vulnerable in use and in storage, whether in electronic or paper format. Physical security deals with methods of securing information and its storage media from local, physical attacks. For instance, an attacker might enter a poorly secured building and attempt to open locked desk drawers or safes. An attacker can also engage in dumpster diving, and may be able to reconstruct shredded information if it is sufficiently valuable to be worth the effort. One countermeasure is to burn or thoroughly crosscut-shred discarded printed plaintexts or storage media; the NSA is infamous for its disposal security precautions.
If plaintext is stored in a computer file (including any backup files made automatically during program execution, even if they are invisible to the user), the storage media, along with the entire computer and its components, must be secure. Sensitive data is sometimes processed on computers whose mass storage is removable, in which case physical security of the removed disk is separately vital. In the case of securing a computer, useful (as opposed to handwaving) security must be physical (e.g., against burglary, brazen removal under cover of supposed repair, installation of covert monitoring devices, etc.) as well as virtual (e.g., against operating system modification, illicit network access, Trojan programs, etc.). The wide availability of keydrives, which can plug into most modern computers and store large quantities of data, poses another severe security headache. A spy (perhaps posing as a cleaning person) could easily conceal one, and even swallow it if necessary.
Discarded computers, disk drives and media are also a potential source of plaintexts. Most operating systems do not actually erase anything; they simply mark the disk space occupied by a deleted file as 'available for use' and remove its entry from the file system directory. The information in a file deleted in this way remains fully present until it is overwritten at some later time, when the operating system reuses the disk space. With even low-end computers commonly sold with many gigabytes of disk space, and capacities still rising, this 'later time' may be months away, or may never come. Even overwriting the portion of a disk surface occupied by a deleted file is insufficient in many cases. Peter Gutmann of the University of Auckland wrote a celebrated 1996 paper on the recovery of overwritten information from magnetic disks; areal storage densities have risen greatly since then, so this sort of recovery is likely to be more difficult than it was when Gutmann wrote.
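As a rough illustration of the difference between ordinary deletion and overwriting, the Python sketch below overwrites a file's contents in place with random bytes before removing its directory entry. The function name `overwrite_and_delete` and its parameters are assumptions made for this example. It is a best-effort measure only: on journaling or copy-on-write file systems, on flash storage with wear levelling, or where backup copies exist, the original data may survive elsewhere on the medium.

```python
import os

def overwrite_and_delete(path: str, passes: int = 1, chunk_size: int = 1 << 20) -> None:
    """Overwrite a file in place with random bytes, then remove it.

    Best effort only: the file system or the drive may keep older copies
    of the data in locations this function cannot reach.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:  # open for update without truncating
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the writes to the device
    os.remove(path)  # a plain delete: only the directory entry goes away
```

Calling only `os.remove(path)` corresponds to the behaviour described above, where the space is marked as available and the contents stay on disk until reused.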
Independently of this, modern hard drives automatically remap sectors that are starting to fail; sectors taken out of use in this way still contain information that is entirely invisible to the file system (and to all software that uses it to access disk data) but is nonetheless present on the physical drive platter. It may, of course, be sensitive plaintext. Some government agencies (e.g., the NSA) require that all disk drives be physically pulverized when they are discarded, and in some cases chemically treated with corrosives before or after. This practice is not widespread outside government, however. For example, Garfinkel and Shelat (2003) analyzed 158 second-hand hard drives acquired at garage sales and the like and found that fewer than 10% had been sufficiently sanitized. A wide variety of personal and confidential information was readable from the others. See data remanence.
Laptop computers are a special problem. The US State Department, the British Secret Service, and the US Department of Defense have all had laptops containing secret information, some perhaps in plaintext form, 'vanish' in recent years. Announcements of similar losses are becoming a common item in news reports. Disk encryption techniques can provide protection against such loss or theft, provided they are properly chosen and used.
On occasion, even when the data on the host systems is itself encrypted, the media used to transfer data between such systems nevertheless carry plaintext because of a poorly designed data policy. A case in point is the incident in October 2007 in which HM Revenue and Customs lost CDs containing the records of 25 million child benefit recipients in the United Kingdom, the data apparently being entirely unencrypted.
Modern cryptographic systems are designed to resist known-plaintext or even chosen-plaintext attacks, and so may not be entirely compromised when plaintext is lost or stolen. Older systems used techniques such as padding and Russian copulation to obscure information in plaintext that could be easily guessed, and to limit the effect that a loss of plaintext would have on the security of the cryptosystem.
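A minimal sketch of the rearrangement idea behind Russian copulation follows. The details are assumptions made for illustration: the message is cut at a single point and the two parts are swapped, so that a predictable opening ends up buried in the middle, and a hypothetical marker (`//`) shows the recipient where the original start now lies. The marker's characters are assumed never to occur in the message itself.

```python
def copulate(message: str, cut: int, marker: str = "//") -> str:
    """Swap the two parts of the message around the cut point before encryption."""
    return message[cut:] + marker + message[:cut]

def uncopulate(text: str, marker: str = "//") -> str:
    """Restore the original order after decryption (marker assumed absent from the message)."""
    tail, _, head = text.partition(marker)
    return head + tail

if __name__ == "__main__":
    original = "WEATHER REPORT FOR TODAY: CLEAR SKIES"
    rearranged = copulate(original, cut=15)
    print(rearranged)              # the stereotyped opening no longer comes first
    print(uncopulate(rearranged))  # recovers the original wording
```

In practice the cut point would be varied from message to message, so that an analyst cannot rely on a guessable header appearing at a fixed position in the plaintext.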