Postediting
Postediting “is the process of improving a machine-generated translation with a minimum of manual labour”. A person who postedits is called a posteditor. The concept of postediting is linked to that of pre-editing. When a text is translated via machine translation, the best results may be gained by pre-editing the source text, for example by applying the principles of controlled language, and then postediting the machine output. Postediting is distinct from editing, which refers to the process of improving human-generated text (a process which in translation is often known as revision). Postedited text may afterwards be revised for quality assurance.

Postediting involves the correction of machine translation output to ensure that it meets a level of quality negotiated in advance between client and posteditor. Light postediting aims at making the output simply understandable; full postediting aims at making it also stylistically appropriate. With advances in machine translation, full postediting is becoming an alternative to manual translation. A number of software tools support postediting of machine-translated output, including the Google Translator Toolkit, SDL Trados and Systran.

Postediting and machine translation

Machine translation left the labs and began to be used for its actual purpose in the late seventies at some big institutions such as the European Commission and the Pan-American Health Organization, and later at some corporations such as Caterpillar and General Motors. The first studies on postediting appeared in the eighties, linked to those implementations. To develop appropriate guidelines and training, members of the Association for Machine Translation in the Americas (AMTA) and the European Association for Machine Translation (EAMT) set up a Post-editing Special Interest Group in 1999.

After the nineties, advances in computer power and connectivity sped up machine translation development and allowed for its deployment through the web browser, including as a free, useful adjunct to the main search engines (Google Translate, Bing Translator, Yahoo! Babel Fish). A wider acceptance of less-than-perfect machine translation was accompanied by a wider acceptance of postediting. With the demand for localisation of goods and services growing at a pace that could not be met by human translation, not even when assisted by translation memory and other translation management technologies, industry bodies such as the Translation Automation Users Society (TAUS) expect machine translation and postediting to play a much bigger role within the next few years.

Light and full postediting

Studies in the eighties distinguished between degrees of postediting which, in the context of the European Commission Translation Service, were first defined as conventional and rapid, or full and rapid. Light and full postediting seems to be the wording most used today.

Light postediting implies minimal intervention by the posteditor, only as much as is strictly required to help the end user make some sense of the text; the expectation is that the client will use it for inbound purposes only, often when the text is needed urgently or has a short life span.

Full postediting involves a greater level of intervention to achieve a degree of quality to be negotiated between client and posteditor; the expectation is that the outcome will be a text that is not only understandable but presented in some stylistically appropriate way, so it can be used for assimilation and even for dissemination, for inbound and for outbound purposes.

At the top end of full postediting there is the expectation of a level of quality which is indistinguishable from that of human translation. The assumption, however, has been that it takes less effort for translators to work directly from the source text than to postedit the machine-generated version. With advances in machine translation, this may be changing. For some language pairs and for some tasks, and with engines that have been trained with good-quality domain-specific data, some clients are already asking translators to postedit instead of translating from scratch, in the belief that they will attain similar quality at a lower cost.

The light/full classification, developed in the nineties when machine translation still came on a CD-ROM, may not suit advances in machine translation at the light postediting end either. For some language pairs and some tasks, particularly if the source has been pre-edited, raw machine output may be good enough for gisting purposes without requiring prior human intervention.

Postediting and the language industry

After some thirty years, postediting is still “a nascent profession”. The right profile of the posteditor has not yet been fully studied. Postediting overlaps with translating and editing, but only partially. Most think the ideal posteditor will be a translator keen to be trained in the specific skills required, but some think a bilingual person without a background in translation may be easier to train. Nor is much known about who the actual posteditors are, whether they work mostly as in-house employees or as freelancers, and under what conditions.

Postediting is used when raw machine translation is not good enough and manual translation is not required. Industry advises using postediting when it can at least double the productivity of manual translation, or even quadruple it in the case of light postediting. These observations are likely to be based on guesses made some years back rather than on facts as they apply now.
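The rule of thumb above can be sketched as a simple ratio check. This is only an illustration: the function names, the use of words per hour as the productivity measure, and the reading of "double" and "fourfold" as thresholds of 2 and 4 are assumptions, not an industry-standard method.

```python
def productivity_gain(manual_words_per_hour: float,
                      postedit_words_per_hour: float) -> float:
    """Return how many times faster postediting is than manual translation.

    Words per hour is an assumed measure; any consistent throughput unit works.
    """
    if manual_words_per_hour <= 0:
        raise ValueError("manual throughput must be positive")
    return postedit_words_per_hour / manual_words_per_hour


def meets_rule_of_thumb(gain: float, light: bool = False) -> bool:
    """Check a gain against the assumed thresholds: 2x (full), 4x (light)."""
    threshold = 4.0 if light else 2.0
    return gain >= threshold


# Hypothetical figures: 500 words/hour translating, 1200 words/hour postediting.
gain = productivity_gain(500, 1200)
print(f"{gain:.1f}x")                          # 2.4x
print(meets_rule_of_thumb(gain))               # True: clears the 2x bar
print(meets_rule_of_thumb(gain, light=True))   # False: below the 4x bar
```

The figures in the usage lines are invented for illustration; real throughput varies with language pair, domain and engine quality.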

There are no clear figures on how big the postediting pie is within the translation industry. A recent survey showed that 50% of language service providers offered it, but for 85% of them it accounted for less than 10% of their throughput.

Productivity and volume estimates are, in any case, moving targets: advances in machine translation, driven in significant part by postedited text being fed back into its engines, will mean that the more we postedit, the more widespread postediting will become… until the profession – and the industry – of postediting vanishes.

See also

  • Machine translation
  • Controlled language
  • Translation memory
  • Editing
  • Proofreading

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.