YEnc
yEnc is a binary-to-text encoding scheme for transferring binary files in messages on Usenet or via e-mail. It reduces the overhead of earlier US-ASCII-based encoding methods by using an 8-bit Extended ASCII encoding method. When each byte value appears with roughly the same frequency, yEnc's overhead is often as little as 1–2% (see http://www.yenc.org/yenc-draft.1.3.txt), compared to 33%–40% overhead for 6-bit encoding methods like uuencode and Base64.
With decreased overhead, the encoded message body is smaller. Therefore, the message can be delivered faster and requires less storage space.
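The difference in overhead can be checked with a little arithmetic. The following is a minimal sketch, not a benchmark: it assumes a payload in which every byte value is equally frequent, counts only the bytes yEnc would have to escape (assuming the offset of 42 and the four escaped values described in the yEnc draft), and ignores line breaks and header lines for both encodings.

```python
import base64

# Hypothetical payload: every byte value appears equally often (12 times each).
payload = bytes(range(256)) * 12  # 3072 bytes

# Base64 maps every 3 input bytes to 4 output characters.
b64_len = len(base64.b64encode(payload))
b64_overhead = b64_len / len(payload) - 1

# yEnc (per the draft, as assumed here) adds 42 to each byte and escapes the
# result only if it is NUL, LF, CR or '='; each escape costs one extra byte.
escaped = sum(1 for b in payload if (b + 42) % 256 in (0x00, 0x0A, 0x0D, 0x3D))
yenc_overhead = escaped / len(payload)

print(f"Base64 overhead: {b64_overhead:.1%}")
print(f"yEnc escape overhead: {yenc_overhead:.1%}")
```

With uniformly distributed bytes, four of the 256 possible values need escaping, which is where the figure of roughly 1–2% comes from.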
An additional advantage of yEnc over previous encoding methods, such as uuencode and Base64, is the inclusion of a CRC checksum to verify that the decoded file has been delivered intact.
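Such an integrity check might look like the sketch below. It assumes the checksum is a standard CRC-32 written as eight hexadecimal digits, as in the yEnc draft; the function names `crc32_hex` and `is_intact` are illustrative and not part of any yEnc tool.

```python
import zlib

def crc32_hex(data: bytes) -> str:
    """Return the CRC-32 of data as eight lowercase hexadecimal digits."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")

def is_intact(decoded: bytes, expected_hex: str) -> bool:
    """Compare the CRC-32 of the decoded file with the transmitted value."""
    return crc32_hex(decoded) == expected_hex.strip().lower()

# Usage: the sender's checksum travels with the message; the receiver
# recomputes it over the decoded bytes and compares.
print(is_intact(b"123456789", "CBF43926"))
```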
yEnc was created and released into the public domain by Jürgen Helbing in 2001.
How yEnc works
Usenet and email message bodies were intended to contain only ASCII characters (RFC 822 or RFC 2822). Most competing encodings represent binary files by converting them into printable ASCII characters, because the range of printable ASCII characters is supported by most operating systems. However, since this reduces the available character set considerably, there is significant overhead (wasted bandwidth) on networks that carry 8-bit bytes. For example, in uuencode and Base64, three bytes of data are encoded into four printable ASCII characters, which equals four bytes, a 33% overhead (not including the overhead from headers). yEnc uses one character (one byte) to represent one byte of the file, with a few exceptions.
The RFCs that define Internet messages still require that carriage returns and line feeds have special meaning in a mail message. Therefore, yEnc escapes the carriage return and line feed characters in the encoded body.
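The byte-level transformation can be sketched as follows. This is a minimal illustration rather than a complete implementation: it assumes the rules of the yEnc draft (add 42 to each byte modulo 256; if the result is NUL, LF, CR or "=", emit "=" followed by the result plus 64), and it leaves out line splitting and the header and trailer lines. The function names are illustrative.

```python
ESCAPE = ord("=")
# Encoded values that must not appear literally: NUL, LF, CR and the
# escape character itself (set assumed from the yEnc draft).
CRITICAL = {0x00, 0x0A, 0x0D, ESCAPE}

def yenc_encode(data: bytes) -> bytes:
    """Encode raw bytes; one output byte per input byte unless escaped."""
    out = bytearray()
    for b in data:
        e = (b + 42) % 256
        if e in CRITICAL:
            out.append(ESCAPE)      # escape marker costs one extra byte
            e = (e + 64) % 256
        out.append(e)
    return bytes(out)

def yenc_decode(encoded: bytes) -> bytes:
    """Reverse yenc_encode; raises StopIteration on a dangling escape."""
    out = bytearray()
    it = iter(encoded)
    for e in it:
        if e == ESCAPE:
            e = (next(it) - 64) % 256   # undo the escape offset
        out.append((e - 42) % 256)
    return bytes(out)

sample = bytes(range(256))
encoded = yenc_encode(sample)
print(len(sample), len(encoded))
```

Only four of the 256 byte values trigger an escape, so the encoded body contains no literal carriage return, line feed or NUL.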
There is no RFC or other standards document describing yEnc. The yEnc homepage contains a draft informal specification and a grammar (which contradict RFC 2822 and RFC 2045), although neither has been submitted to the Internet Engineering Task Force.
As with uuencoding, despite its flaws, yEnc remains active and effective on Usenet. The yEnc homepage states that "all major newsreaders have been extended to yEnc support". Microsoft's Outlook Express and Mozilla Thunderbird are popular email clients that can be used as newsreaders. Neither provides direct yEnc support for either news or mail, but there are plugins available.
Criticisms
The creator of the yEnc encoding scheme and others have criticized the design of yEnc. It suffers from many of the same flaws as uuencode, a number of which MIME had already solved years earlier. For example, yEnc requires the strings "=ybegin" and "=yend" to be placed around the encoded file in the message body. Although this is an improvement over uuencode's "begin" and "end", which occur more frequently in normal text, message readers can still encounter attachments where those strings are present (most frequently in discussions about yEnc itself). yEnc and uuencode also attempt to reassemble files split into multiple messages by using the subject line, which is unreliable.
Moreover, yEnc adds a few new flaws of its own. It attempts to turn unstructured fields into structured ones, which is unreliable, given that no constraints can be placed upon the unstructured use of the fields by non-yEnc uses. Most notably, the subject line of the message is supposed to contain the string "yEnc", the filename, and the part number. (The yEnc homepage chastises yEnc article posters for themselves not observing these constraints.) MIME places all such information in the message headers, which is far more reliable.
Uuencode was careful to support Internet messages as streams of text, which yEnc does not support. Software that supports yEnc encoding must know the size of the original file in advance, because the file size is specified in the yEnc header that precedes the encoded file.
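The dependence on the file size is visible in the header line itself. The sketch below parses a header of the form used in the yEnc draft, with `line`, `size` and `name` keywords; the exact keyword set is an assumption taken from that draft, and the function name and sample file name are hypothetical.

```python
def parse_ybegin(line: str) -> dict:
    """Split a '=ybegin' header line into its keyword=value fields."""
    prefix = "=ybegin "
    if not line.startswith(prefix):
        raise ValueError("not a yEnc header line")
    rest = line[len(prefix):]
    # 'name=' is treated as the last keyword so the file name may contain spaces.
    head, found, name = rest.partition(" name=")
    fields = dict(item.split("=", 1) for item in head.split())
    if found:
        fields["name"] = name.strip()
    return fields

header = parse_ybegin("=ybegin line=128 size=123456 name=my file.bin")
print(header["size"], header["name"])
```

Because `size` appears before the first encoded byte, an encoder cannot emit the header until it knows the full length of its input.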
Not all transports can handle the 8-bit characters employed by yEnc, which may cause data corruption. yEnc can also be mangled by different character sets. It works poorly with the increasingly popular UTF-8 character set, for instance. Moreover, some article transports may, on the grounds of enforcing compliance with the Internet message format standard, automatically convert any message using 8-bit characters to either Base64 or quoted-printable, entirely nullifying the overhead advantage.
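Both effects are easy to reproduce. The following sketch uses a hypothetical 128-byte body made up entirely of bytes with the high bit set. It assumes one particular failure mode, a transport that reads the body as Latin-1 and re-encodes it as UTF-8, and compares the result with a Base64 conversion of the same bytes.

```python
import base64

# Hypothetical encoded body: 128 bytes, all in the 8-bit range 0x80-0xFF.
body = bytes(range(128, 256))

# Assumed failure mode: body interpreted as Latin-1 text, then sent as UTF-8.
# Every byte >= 0x80 becomes a two-byte sequence, so the data no longer matches.
as_utf8 = body.decode("latin-1").encode("utf-8")

# A transport that enforces 7-bit bodies may instead re-encode as Base64.
as_base64 = base64.b64encode(body)

print(len(body), len(as_utf8), len(as_base64))
```

In the first case the payload is altered and a decoder that is unaware of the conversion recovers the wrong bytes; in the second the data survives, but at Base64's size.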
Critics also take issue with the lack of formal standardization.
Some people, including yEnc's creator, have suggested including yEnc as part of MIME, which would solve nearly all of its problems and retain the low encoding overhead. However, no formal or informal standard has been reached.