Formatted text
Formatted text, styled text or rich text, as opposed to plain text, has styling information beyond the minimum of semantic elements: colours, styles (boldface, italic), sizes and special features (such as hyperlinks).
Terminology
Formatted text cannot rightly be identified with binary files, nor is it distinct from ASCII text. Formatted text is not necessarily binary: it may be text-only, as in HTML, RTF or enriched text files, and it may be ASCII-only. Conversely, a plain text file may be non-ASCII (in a Unicode encoding such as UTF-8). Text-only formatted text is achieved by markup, which is itself textual, while some editors of formatted text, such as Microsoft Word, save in a binary format.
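The independence of "formatted" from "ASCII" can be checked directly. The following is a minimal Python sketch; the function name `describe` and both sample strings are illustrative assumptions, not part of any standard.

```python
# Formatted text that happens to be pure ASCII (HTML markup).
formatted = "<p>The dog is classified as <i>Canis lupus familiaris</i>.</p>"

# Plain text (no styling information at all) that is not ASCII.
plain = "naïve café"


def describe(text):
    """Report whether text is ASCII-only and how many bytes UTF-8 needs for it."""
    return {
        "ascii_only": text.isascii(),             # requires Python 3.7 or later
        "chars": len(text),
        "utf8_bytes": len(text.encode("utf-8")),  # non-ASCII characters take 2+ bytes
    }


print(describe(formatted))
print(describe(plain))
```

The formatted sample is ASCII-only, while the plain sample needs more UTF-8 bytes than it has characters.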
Beginning of formatted text
Formatted text has its genesis in the first interactive systems, where users made up for the lack of formatting in ASCII by using certain symbols as substitutes. Emphasis, for example, could be achieved in ASCII in a number of ways:
- Capitalization: I am NOT making this up.
- Surrounding with underscores: I am _not_ making this up.
- Surrounding with asterisks: I am *not* making this up.
- Spacing: I am n o t making this up.
Surrounding by underscores was also used for book titles: Look it up in _The_C_Programming_Language_.
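Two of these conventions (underscores and asterisks) are regular enough to be recognised mechanically. The sketch below converts them to HTML; the function name and the choice of the `<em>` tag for both conventions are assumptions made for illustration, and capitalization and spacing are left untouched because they cannot be told apart reliably from ordinary text.

```python
import html
import re

# _word_ or _several_joined_words_, not embedded inside an identifier.
_UNDERSCORED = re.compile(r"(?<!\w)_([^\W_](?:\w*[^\W_])?)_(?!\w)")
# *word* or *several words*, with no space just inside the asterisks.
_STARRED = re.compile(r"(?<!\w)\*([^\s*](?:[^*]*[^\s*])?)\*(?!\w)")


def ascii_emphasis_to_html(text):
    """Convert _underscore_ and *asterisk* emphasis to <em> elements."""
    text = html.escape(text, quote=False)  # keep literal <, > and & safe
    # Underscores joining the words of a title become spaces.
    text = _UNDERSCORED.sub(
        lambda m: "<em>" + m.group(1).replace("_", " ") + "</em>", text
    )
    return _STARRED.sub(r"<em>\1</em>", text)


print(ascii_emphasis_to_html("I am _not_ making this up."))
print(ascii_emphasis_to_html("Look it up in _The_C_Programming_Language_."))
```

Identifiers such as my_var_name and arithmetic such as 2 * 3 * 4 are left alone, because the patterns require the markers to sit at word boundaries.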
Markup languages
Formatting can be marked by tags distinguished from the body text by special characters, such as angle brackets in HTML. For example, this text:
The dog is classified as Canis lupus familiaris in taxonomy.
is marked up in HTML thus:
<p>The dog is classified as <i>Canis lupus familiaris</i> in taxonomy.</p>
The italicised text is enclosed by an opening and a closing italics tag. In LaTeX, the text would be marked up like this:
The dog is classified as \textit{Canis lupus familiaris} in taxonomy.
Documents in a markup language can be written with any text editor, needing no special software.
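Because the tags delimit the formatted span explicitly, a program can separate the body text from its formatting. The following Python sketch uses the standard library's `html.parser` module; the class name `ItalicCollector` and the helper `split_markup` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class ItalicCollector(HTMLParser):
    """Collect all body text, and separately the text inside <i> elements."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <i> elements are currently open
        self.plain = []     # every piece of body text, tags removed
        self.italic = []    # only the pieces found inside <i>...</i>

    def handle_starttag(self, tag, attrs):
        if tag == "i":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "i" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        self.plain.append(data)
        if self.depth:
            self.italic.append(data)


def split_markup(html_text):
    """Return (text without tags, list of italicised spans)."""
    parser = ItalicCollector()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.plain), parser.italic


source = "<p>The dog is classified as <i>Canis lupus familiaris</i> in taxonomy.</p>"
text, italic = split_markup(source)
print(text)
print(italic)
```

Applied to the HTML example above, the sketch recovers the original sentence and reports the species name as the only italicised span.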
Formatted document files
Since the introduction of MacWrite, the first widely available WYSIWYG word processor, in which the typist applies the formatting visually rather than by inserting textual markup, word processors have tended to save to binary files. Opening such files with a text editor reveals the text embellished with various binary characters, either around the formatted areas (e.g. in WordPerfect) or separately, at the beginning or end of the file (e.g. in Microsoft Word).
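What a text editor shows when it opens such a file can be approximated by pulling the runs of printable ASCII out of the raw bytes. This is a rough Python sketch: the function name `printable_runs` is hypothetical, and the sample bytes are invented for illustration and do not follow the layout of any real word processor format.

```python
import re


def printable_runs(data, min_len=4):
    """Return the runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]


# Invented sample: readable text surrounded by non-text control bytes.
sample = b"\x00\x01\xd0\xcfHello, world\x02\x00\x13bold text\xff\xfe\x00ab\x00"

print(printable_runs(sample))
```

The readable text survives, but the bytes that carried the formatting are discarded, which is why the formatting cannot be recovered this way.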
Formatted text documents in binary files have, however, the disadvantages of formatting scope and secrecy. Whereas the extent of formatting is accurately marked in markup languages, WYSIWYG formatting is based on memory: for example, pressing the boldface button stays in effect until it is cancelled. This can lead to formatting mistakes and maintenance troubles. As for secrecy, formatted text document file formats tend to be proprietary and undocumented, which makes it difficult for third parties to write compatible software and also leads to unnecessary upgrades because of version changes.
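The scope problem can be illustrated with a toy model in Python. The event names and the function `render_toggles` are hypothetical and do not describe how any particular word processor works; the point is only that a toggle stays in effect until it is cancelled, whereas a markup tag pair states its extent explicitly.

```python
def render_toggles(events):
    """Apply a stream of typing events in which 'toggle_bold' flips a bold state.

    Returns a list of (text, is_bold) pairs.
    """
    bold = False
    out = []
    for kind, value in events:
        if kind == "toggle_bold":
            bold = not bold          # the state is remembered until toggled again
        elif kind == "text":
            out.append((value, bold))
    return out


# The typist presses the bold button before "not" but forgets to cancel it.
forgotten = [
    ("text", "I am "),
    ("toggle_bold", None),
    ("text", "not"),
    ("text", " making this up."),
]

for piece, is_bold in render_toggles(forgotten):
    print(repr(piece), "bold" if is_bold else "regular")
```

In this model the bold state leaks into the rest of the sentence, a mistake that is easy to miss on screen and that a closing tag in a markup language would have made visible.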
WordStar was a popular word processor that did not use binary files with hidden characters.
OpenOffice.org Writer saves files in an XML format. However, the resultant file is binary, since the XML is packaged in a compressed archive (a ZIP file, comparable to a compressed tarball).
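The effect of packaging textual XML in a compressed container can be sketched with Python's standard `zipfile` module. OpenDocument files are ZIP archives whose main text is stored in a member named content.xml, but the archive built here is a deliberately simplified stand-in and not a valid OpenDocument file; the function names `pack` and `unpack` and the sample XML are assumptions made for the example.

```python
import io
import zipfile


def pack(xml_text):
    """Store XML text as 'content.xml' inside an in-memory compressed ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("content.xml", xml_text)
    return buffer.getvalue()


def unpack(data):
    """Recover the XML text from an archive produced by pack()."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.read("content.xml").decode("utf-8")


xml_text = "<text>" + "The dog is classified as Canis lupus familiaris. " * 20 + "</text>"
packed = pack(xml_text)

print(packed[:2])                      # ZIP archives begin with the bytes b"PK"
print(len(xml_text), len(packed))      # sizes before and after packaging
print(unpack(packed) == xml_text)      # the markup is intact once unpacked
```

The packed bytes are unreadable in a text editor, yet the XML markup is recovered unchanged after decompression, so the format is binary only in its packaging.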
PDF (Portable Document Format) is another formatted text file format that is usually binary (using compression for the text, and storing graphics and fonts in binary). It is generally an end-user format, written from an application such as Microsoft Word or OpenOffice.org Writer, and is not normally edited by the user once produced.
External links
- Word Processors: Stupid and Inefficient by Allin Cottrell (opinion piece)