Project Gutenberg
Encyclopedia
Project Gutenberg is a volunteer effort to digitize and archive cultural works, to "encourage the creation and distribution of eBooks". Founded in 1971 by Michael S. Hart, it is the oldest digital library. Most of the items in its collection are the full texts of public domain books. The project tries to make these as free as possible, in long-lasting, open formats that can be used on almost any computer.
Wherever possible, the releases are available in plain text, but other formats are included, such as HTML, PDF, EPUB, MOBI, and Plucker. Most releases are in the English language, but many non-English works are also available. There are multiple affiliated projects that are providing additional content, including regional and language-specific works. Project Gutenberg is also closely affiliated with Distributed Proofreaders, an Internet-based community for proofreading scanned texts.
History
Project Gutenberg was started by Michael Hart in 1971 with the digitization of the United States Declaration of Independence. Hart, a student at the University of Illinois, obtained access to a Xerox Sigma V mainframe computer in the university's Materials Research Lab. Through friendly operators, he received an account with a virtually unlimited amount of computer time; its value at that time has since been variously estimated at $100,000 or $100,000,000. Hart has said he wanted to "give back" this gift by doing something that could be considered to be of great value. His initial goal was to make the 10,000 most consulted books available to the public at little or no charge, and to do so by the end of the 20th century.
This particular computer was one of the 15 nodes on ARPANET, the computer network that would become the Internet. Hart believed that computers would one day be accessible to the general public and decided to make works of literature available in electronic form for free. He used a copy of the United States Declaration of Independence that he had in his backpack, and this became the first Project Gutenberg e-text.
He named the project after Johannes Gutenberg, the fifteenth-century German printer who propelled the movable type printing press revolution.
By the mid-1990s, Hart was running Project Gutenberg from Illinois Benedictine College. More volunteers had joined the effort. All of the text was entered manually until 1989, when image scanners and optical character recognition software improved and became more widely available, which made book scanning more feasible. Hart later came to an arrangement with Carnegie Mellon University, which agreed to administer Project Gutenberg's finances. As the volume of e-texts increased, volunteers began to take over the project's day-to-day operations that Hart had run.
Starting in 2004, an improved online catalog made Project Gutenberg content easier to browse, access and hyperlink. Project Gutenberg is now hosted by ibiblio at the University of North Carolina at Chapel Hill.
Pietro Di Miceli, an Italian volunteer, developed and administered the first Project Gutenberg website and started the development of the Project online Catalog. In his ten years in this role (1994–2004), the Project web pages won a number of awards, often being featured in "best of the Web" listings, and contributing to the project's popularity.
Project Gutenberg's founder, Michael Hart, died on September 6, 2011, at his home in Urbana, Illinois, at the age of 64.
Affiliated organizations
In 2000, a non-profit corporation, the Project Gutenberg Literary Archive Foundation, Inc., was chartered in Mississippi to handle the project's legal needs. Donations to it are tax-deductible. Long-time Project Gutenberg volunteer Gregory Newby became the foundation's first CEO.
Also in 2000, Charles Franks founded Distributed Proofreaders (DP), which allowed the proofreading of scanned texts to be distributed among many volunteers over the Internet. This effort greatly increased the number and variety of texts being added to Project Gutenberg, as well as making it easier for new volunteers to start contributing. DP became officially affiliated with Project Gutenberg in 2002. The 10,000+ books contributed by DP have comprised almost a third of the books in Project Gutenberg.
CD and DVD Project
In August 2003, Project Gutenberg created a CD containing approximately 600 of the "best" e-books from the collection. The CD is available for download as an ISO image. When users are unable to download the CD, they can request to have a copy sent to them, free of charge.
In December 2003, a DVD was created containing nearly 10,000 items. At the time, this represented almost the entire collection. In early 2004, the DVD also became available by mail.
In July 2007, a new edition of the DVD was released containing over 17,000 books, and in April 2010, a dual-layer DVD was released, containing nearly 30,000 items.
The majority of the DVDs, and all of the CDs, mailed by the project were recorded on recordable media by volunteers. However, the new dual-layer DVDs were manufactured, as this proved more economical than having volunteers burn them. The project has mailed approximately 40,000 discs.
The ISO images are available in Gutenberg:The CD and DVD Project.
Scope of collection
Project Gutenberg has reported an average of over fifty new e-books being added to its collection each week. These are primarily works of literature from the Western cultural tradition. In addition to literature such as novels, poetry, short stories and drama, Project Gutenberg also has cookbooks, reference works and issues of periodicals. The Project Gutenberg collection also has a few non-text items such as audio files and music notation files.
Most releases are in English, but there are also significant numbers in many other languages. The non-English languages most represented are French, German, Finnish, Dutch, Portuguese, and Chinese.
Whenever possible, Gutenberg releases are available in plain text, mainly using US-ASCII character encoding but frequently extended to ISO-8859-1 (needed to represent accented characters in French and the sharp s (ß) in German, for example). Besides freedom from copyright, a text version in a Latin character set has been a requirement of Michael Hart's since the founding of Project Gutenberg, as he believed this to be the format most likely to be readable in the extended future. Out of necessity, this criterion has had to be extended further for the sizable collection of texts in East Asian languages such as Chinese and Japanese now in the collection, where UTF-8 is used instead.
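As a rough illustration of what these three encodings mean for a reader of the plain-text files, the following Python sketch decodes raw bytes by trying US-ASCII first, then UTF-8, then ISO-8859-1. The function name `decode_plain_text` and the fallback order are assumptions made for this example; they are not part of any Project Gutenberg tooling.

```python
from typing import Tuple


def decode_plain_text(raw: bytes) -> Tuple[str, str]:
    """Decode raw bytes and report which encoding succeeded.

    Hypothetical helper for illustration only. ISO-8859-1 is tried
    last because it assigns a character to every byte value, so it
    can never fail and would otherwise mask UTF-8 text.
    """
    for encoding in ("ascii", "utf-8"):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
    return raw.decode("iso-8859-1"), "iso-8859-1"


if __name__ == "__main__":
    samples = [
        b"plain text",                    # pure US-ASCII
        "Straße".encode("iso-8859-1"),    # German sharp s as one byte
        "中文".encode("utf-8"),            # Chinese text as UTF-8
    ]
    for raw in samples:
        text, encoding = decode_plain_text(raw)
        print(encoding, text)
```

Guessing an encoding this way is only a heuristic: a file that states its own character set should be decoded with that character set instead.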
Other formats may be released as well when submitted by volunteers. The most common non-ASCII format is HTML, which allows markup and illustrations to be included. Some project members and users have requested more advanced formats, believing them to be much easier to read. But some formats that are not easily editable, such as PDF, are generally not considered to fit in with the goals of Project Gutenberg, although many works are being introduced to the collection in PDF format so that illustrations can be added to downloadable documents. For years, there has been discussion of using some type of XML, although progress on that has been slow.
In 2009, the Project Gutenberg catalog began offering auto-generated alternate file formats, including HTML, EPUB and Plucker.
Ideals
Michael Hart said in 2004, "The mission of Project Gutenberg is simple: 'To encourage the creation and distribution of ebooks'". His goal was "to provide as many e-books in as many formats as possible for the entire world to read in as many languages as possible". Likewise, a project slogan is to "break down the bars of ignorance and illiteracy", because its volunteers aim to continue spreading public literacy and appreciation for the literary heritage just as public libraries began to do in the late 19th century.
Project Gutenberg is intentionally decentralized. For example, there is no selection policy dictating what texts to add. Instead, individual volunteers work on what they are interested in, or have available. The Project Gutenberg collection is intended to preserve items for the long term, so that they cannot be lost by any one localized accident. In an effort to ensure this, the entire collection is backed up regularly and mirrored on servers in many different locations.
Copyright
Project Gutenberg is careful to verify the status of its ebooks according to U.S. copyright law. Material is added to the Project Gutenberg archive only after it has received a copyright clearance, and records of these clearances are saved for future reference. Unlike some other digital library projects, Project Gutenberg does not claim new copyright on titles it publishes. Instead, it encourages their free reproduction and distribution.
Most books in the Project Gutenberg collection are distributed as public domain under U.S. copyright law. The licensing included with each ebook puts some restrictions on what can be done with the texts (such as distributing them in modified form, or for commercial purposes) as long as the Project Gutenberg trademark is used. If the header is stripped and the trademark not used, then the public domain texts can be reused without any restrictions.
There are also a few copyrighted texts that Project Gutenberg distributes with permission. These are subject to further restrictions as specified by the copyright holder.
Criticism
Some people have criticized Project Gutenberg for lack of scholarly rigor in its e-texts: for example, there is usually inadequate information about the edition used and often omission of original prefaces. However, John Mark Ockerbloom of the University of Pennsylvania noted that Project Gutenberg is responsive about addressing errors once they are identified, and the texts now include specific source edition citations. In many cases the editions also are not the most current scholarly editions, for these later editions are not usually in the public domain.
In the plain-text releases, the text is wrapped at 65-70 characters per line and paragraphs are separated by a blank line. Although this makes a release readable by anybody with a text reader, a drawback of this format is the lack of markup and the resulting relatively bland appearance.
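The plain-text layout described above can be illustrated with a short sketch. It is only an illustration that uses Python's standard `textwrap` module, not the tooling Project Gutenberg itself uses; the function names `to_plain_text` and `from_plain_text` and the default width of 70 are assumptions chosen for the example.

```python
import re
import textwrap


def to_plain_text(paragraphs, width=70):
    """Wrap each paragraph to `width` columns; separate paragraphs with a blank line."""
    wrapped = []
    for paragraph in paragraphs:
        # Collapse existing whitespace so earlier line breaks do not survive.
        normalized = " ".join(paragraph.split())
        wrapped.append(textwrap.fill(normalized, width=width))
    return "\n\n".join(wrapped)


def from_plain_text(text):
    """Split wrapped text on blank lines and rejoin each paragraph into one line."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [" ".join(block.split()) for block in blocks if block.strip()]
```

Because a blank line is the only structural signal in this layout, a reader program can recover paragraph boundaries but nothing else, such as italics or headings, which is the lack of markup noted above.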
While the works in Project Gutenberg represent a valuable sample of publications that span several centuries, there are some issues of concern for linguistic analysis. Some content may have been modified by the transcriber because of editorial changes or corrections (such as to correct for obvious proof-setting or printing errors). The spelling may also have been modified to conform with current practices (although the intent by Project Gutenberg, and by Distributed Proofreaders, is to preserve the original text and where possible the formatting). This can mean that the works may be problematic when searching for older grammatical usage. Finally, the collected works can be weighted heavily towards certain authors (such as Charles Dickens), while others are barely represented.
In March 2004, a new initiative was begun by Michael Hart and John S. Guagliardo to provide low-cost intellectual properties. The initial name for this project was Project Gutenberg 2 (PG II), which created controversy among PG volunteers because of the re-use of the project's trademarked name for a commercial venture.
Affiliated projects
All affiliated projects are independent organizations which share the same ideals, and have been given permission to use the Project Gutenberg trademark. They often have a particular national, or linguistic focus.
List of affiliated projects
- Project Gutenberg Australia hosts many texts which are public domain according to Australian copyright law, but still under copyright (or of uncertain status) in the United States, with a focus on Australian writers and books about Australia.
- Projekt Gutenberg-DE claims copyright for its product and limits access to browsable web-versions of its texts.
- Project Gutenberg Consortia Center is an affiliate specializing in collections of collections. These do not have the editorial oversight or consistent formatting of the main Project Gutenberg. Thematic collections, as well as numerous languages, are featured.
- PG-EU is a sister project which operates under the copyright law of the European Union. One of its aims is to include as many languages as possible into Project Gutenberg. It operates in Unicode to ensure that all alphabets can be represented easily and correctly.
- Project Gutenberg of the Philippines aims to "make as many books available to as many people as possible, with a special focus on the Philippines and Philippine languages".
- Project Gutenberg of Taiwan seeks to archive copyright free books with a special focus on Taiwan in English, Mandarin and Taiwan based languages. It is a special project of Forumosa.com.
- Project Gutenberg Europe is a project run by Project Rastko in Serbia. It aims at being a Project Gutenberg for all of Europe, and has started to post its first projects in 2005. It is running the Distributed Proofreaders software to quickly produce etexts.
- Project Gutenberg Luxembourg publishes mostly, but not exclusively, books that are written in Luxembourgish.
- Projekti Lönnrot is a project started by Finnish Project Gutenberg volunteers which derives its name from Elias Lönnrot, who was a Finnish philologist.
- Project Gutenberg Canada.
See also
- Google Books
- Internet Archive
- Open Content Alliance
- Project Runeberg, for books significant to the culture and history of the Nordic countries.
- Wikisource or Project Sourceberg
- Runivers
- List of digital library projects
- LibriVox free online audio library, with many texts used from Project Gutenberg
- Aozora Bunko
- Chinese Text Project
- Digital narratology
- Virtual Volunteering
External links
- Project Gutenberg
- The CD and DVD Project - Download the books
- Distributed Proofreaders — a worldwide group of volunteer editors that is now the main source of eBooks for Project Gutenberg (note that many of these have been renamed to Project Gutenberg for trademark concerns, and are not original with the Project)
- Project Gutenberg News — Official News for Gutenberg.org. Includes the Newsletter Archives, 1989–Present.