Virtaal
Virtaal is a computer-assisted translation tool written in the Python programming language. It is free software developed and maintained by Translate.org.za. Virtaal is built using the Translate Toolkit, allowing it to process a number of translation and localisation formats.
Design Philosophy
The key principle behind the design of Virtaal is the optimisation of the interface for the localiser. This includes ensuring that all relevant functionality is keyboard accessible and that needed information is always optimally displayed.
History
Work on Virtaal began in 2007, with an initial 0.1 release made to a small number of open source localisers. Version 0.2, released in October 2008, became the first official release.
Name
The name Virtaal, pronounced [fərˈtɑːl], is a play on words. In Afrikaans, an official language of South Africa, where Translate.org.za is located, the expression "vir taal" means "for language", while the word "vertaal" means "translate".
Supported source document formats
Virtaal works directly with any of the bilingual files (those containing both source and target language) understood by the Translate Toolkit. These include XLIFF, Gettext PO and MO, various Qt files (.qm, .ts, .qph), Wordfast translation memory, TBX, TMX and OmegaT glossaries.
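To illustrate what "bilingual" means for such files, the following minimal sketch reads source and target pairs from a small XLIFF 1.2 fragment using only Python's standard library. It is not Virtaal's code and does not use the Translate Toolkit; the sample document, the constant names and the function `read_units` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by XLIFF 1.2 documents.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample file: one translated unit and one untranslated unit.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="example.txt" source-language="en" target-language="af" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Translate</source>
        <target>Vertaal</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Language</source>
        <target></target>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def read_units(xliff_text):
    """Return a list of (id, source, target) tuples, one per trans-unit."""
    root = ET.fromstring(xliff_text)
    units = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        source = unit.find("x:source", XLIFF_NS)
        target = unit.find("x:target", XLIFF_NS)
        units.append((
            unit.get("id"),
            (source.text or "") if source is not None else "",
            (target.text or "") if target is not None else "",
        ))
    return units


if __name__ == "__main__":
    for unit_id, source, target in read_units(SAMPLE):
        print(unit_id, repr(source), "->", repr(target))
```

Each unit carries the source text and its translation side by side, which is what lets an editor show both languages in one view. An empty target, as in the second unit, marks a segment that still needs translating.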
Features
- Simple single view interface
- Colour highlighting
- Autocorrect
- Autocomplete
- In-context segment filtering:
  - All segments
  - Partial translations and non-translated segments
  - All segments matching a search string (includes case-sensitivity and Python regular expressions)
- Search and replace with regular expressions and Unicode normalisation
- Translation memory with several back-ends:
  - Local translation memory database (including current file)
  - Remote translation memory database (such as an office TM server)
  - Open-Tran.eu
  - Machine translation through Apertium, Google Translate, Microsoft Translator, Moses or the libtranslate library providing access to several others
  - TinyTM
- Terminology help from:
  - Automatically downloaded files
  - Local terminology files
  - Open-Tran.eu
- Recognition and easy insertion of placeables
- Language identification
- Quality checks
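The segment filtering and search features listed above can be sketched roughly as follows. This is an illustrative approximation built on Python's standard `re` and `unicodedata` modules, not Virtaal's actual implementation. The segment representation (plain `(source, target)` tuples), the function names and the sample data are all assumptions, and "partial" translations are not modelled, only empty targets.

```python
import re
import unicodedata

# Hypothetical sample data: (source, target) pairs.
# The third source uses a decomposed "e" + combining acute accent.
SEGMENTS = [
    ("Translate", "Vertaal"),
    ("Language", ""),
    ("Cafe\u0301 menu", ""),
]


def normalise(text, form="NFC"):
    """Apply Unicode normalisation so equivalent sequences compare equal."""
    return unicodedata.normalize(form, text)


def filter_untranslated(segments):
    """Keep segments whose target is empty or whitespace only."""
    return [seg for seg in segments if not seg[1].strip()]


def filter_by_search(segments, pattern, case_sensitive=False, use_regex=False):
    """Keep segments whose source or target matches the search string."""
    pattern = normalise(pattern)
    if not use_regex:
        # Treat the search string literally unless regex mode is requested.
        pattern = re.escape(pattern)
    flags = 0 if case_sensitive else re.IGNORECASE
    compiled = re.compile(pattern, flags)
    return [
        seg for seg in segments
        if compiled.search(normalise(seg[0])) or compiled.search(normalise(seg[1]))
    ]


if __name__ == "__main__":
    print(filter_untranslated(SEGMENTS))
    print(filter_by_search(SEGMENTS, "caf\u00e9"))
    print(filter_by_search(SEGMENTS, r"^(trans|lang)", use_regex=True))
```

Normalising both the pattern and the text before matching means that a search for the precomposed "é" also finds the decomposed form, which is the practical benefit of combining search with Unicode normalisation.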