NooJ
NooJ is a development environment used to construct large-coverage, formalized descriptions of natural languages and to apply them to large corpora in real time.
Author
NooJ is under continuous development and is updated daily by Professor Max Silberztein.
History
Professor Max Silberztein constructed his first package of "Finite State tools for Natural Language Processing", along with the French DELAC-DELACF dictionaries of compound words, as part of his Ph.D. research from 1986 to 1989 at the LADL (University of Paris 7-CNRS) under the supervision of Prof. Maurice Gross.
From 1993 to 2002, he developed a software application called INTEX, which was used at the LADL and at various affiliated laboratories to build DELA dictionaries and perform automatic lexical analysis on texts. See http://intex.univ-fcomte.fr for more details on INTEX.
Since 2002, he has been working on NooJ.
Description
NooJ™ is a freeware linguistic-engineering development environment for formalizing various types of textual phenomena (orthography, lexical and productive morphology, local, structural and transformational syntax). It integrates a broad spectrum of computational technology, from finite-state automata to augmented/recursive transition networks.
The included tools can construct, test, debug, maintain and accumulate large sets of linguistic resources, and can describe:
- Inflectional and derivational morphology,
- Variations in spelling and terminology,
- Vocabularies (simple words, multi-word units and fixed expressions),
- Semi-fixed phenomena (local grammars),
- Syntax (grammars for phrases and full sentences), and
- Semantics (named-entity recognition and transformational analysis).
NooJ can also be used as a corpus-processing system, making it possible to process sets of (thousands of) text files in many ways, including:
- Indexing morpho-syntactic patterns,
- Cataloging fixed or semi-fixed expressions (e.g. technical expressions),
- Creating lemmatized concordances, and
- Statistical analysis of the results.
Modules for several languages are currently available for free download: Arabic, Armenian, Bulgarian, Catalan, Chinese, Croatian, English, French, German, Hebrew, Hungarian, Italian, Polish, Portuguese and Spanish. Several other modules are under development.
NooJ's most distinctive characteristics are:
- The ability to process more than 100 file formats, including HTML, PDF, MS Office, all variants of Unicode, ASCII, etc. It can import information from XML documents and export annotations back to them.
- An annotation system that allows any level of grammar to be applied, yet leaves the original text unmodified. This allows linguists to formalize various phenomena independently and to apply the corresponding grammars in cascade. For instance, by combining inflection, derivation and syntactic data, NooJ can perform Zellig Harris-type transformations.
NooJ can be used as a linguistic-engineering development platform, a corpus processor, an information-extraction system, a terminology extractor and a machine-translation development tool, as well as to teach Linguistics and Computational Linguistics.
Technology
The author followed a Component-Based Software approach for building NooJ. Although he originally used the Java/J2EE framework, he then switched to the C#/.NET framework, which gave NooJ a number of additional capabilities:
- the automatic management of hundreds of text encodings and formats;
- native XML compatibility, both for parsing XML documents and for storing objects (XML/SOAP);
- the ASP.NET library, which allows NooJ to be easily transformed into a Web server application;
- .NET Services and Remoting technology, which allow NooJ’s functionality to be made available as independent agents that run in parallel.
System requirements
NooJ is a .NET application. It currently runs under Windows 95-98-ME, Windows NT-2000, Windows XP and Windows Vista, although some of its functionalities (e.g. Unicode and XML support) are only available with Windows 2000, Windows XP and Windows Vista. As with any application, it is strongly advised that you update both your operating system and the .NET Framework by downloading their latest “Service Pack”.
The Mono and DotGNU projects aim at building a .NET computing environment (i.e. a virtual machine) for Linux, FreeBSD, Mac OS X and several variants of Unix. So far, noojapply.exe has been successfully tested on Mono, but NooJ.exe does not yet run on Mono. For more information, see http://www.mono-project.com and http://www.dotgnu.org
The minimum requirements for running NooJ on small texts (less than one megabyte) are modest: 512 MB of RAM and 1 GB available on the hard drive.
If you plan to use NooJ to parse large corpora (hundreds or thousands of text files), or to compile large-coverage dictionaries (tens of thousands of entries or more), the minimum configuration should be higher: a PC with a Pentium 4 or equivalent processor and 2 GB of RAM or more.
If you are planning to use NooJ to develop large sets of local grammars (hundreds of graphs), a good screen is necessary: at least a 19-inch screen with a 1600x1024 16-bit resolution and a minimum refresh rate of 80 Hz.
Computational devices
NooJ’s linguistic engine includes several computational devices used both to formalize linguistic phenomena and to parse texts:
- Finite-State Transducers (FSTs)
- Finite-State Automata (FSAs)
- Recursive Transition Networks (RTNs)
- Enhanced Recursive Transition Networks (ERTNs)
- Regular Expressions (RegEx)
- Context-Free Grammars (CFGs)
A Finite-State Transducer (FST) is a graph that represents a set of text sequences and then associates each recognized sequence with some analysis result. The text sequences are described in the input part of the FST; the corresponding results are described in the output part of the FST.
Typically, a syntactic FST represents word sequences and produces linguistic information about them (such as their phrasal structure). A morphological FST represents sequences of letters that spell a word form and produces lexical information (such as a part of speech or a set of morphological, syntactic and semantic codes).
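The input/output behaviour of a transducer can be illustrated with a minimal Python sketch. It is not NooJ code and does not use NooJ's graph format: the transition table, the words and the tags such as <DET> are all invented for this example, and the transducer is kept deterministic for brevity.

```python
from typing import Dict, List, Optional, Tuple

# Transition table: (state, input token) -> (next state, output fragment).
# States, words and tags are hypothetical, chosen only for illustration.
TRANSITIONS: Dict[Tuple[int, str], Tuple[int, str]] = {
    (0, "the"): (1, "<DET>"),
    (1, "red"): (1, "<ADJ>"),
    (1, "table"): (2, "<N>"),
    (1, "chair"): (2, "<N>"),
}
FINAL_STATES = {2}


def transduce(tokens: List[str]) -> Optional[List[str]]:
    """Return the output sequence if the FST recognizes `tokens`, else None."""
    state = 0
    outputs: List[str] = []
    for token in tokens:
        step = TRANSITIONS.get((state, token))
        if step is None:
            return None  # no transition: the sequence is not recognized
        state, out = step
        outputs.append(out)
    # The sequence is recognized only if it ends in a final state.
    return outputs if state in FINAL_STATES else None
```

Calling transduce(["the", "red", "table"]) walks the input part of the table and collects the output part, while an incomplete sequence such as ["the", "red"] is rejected because it does not end in a final state.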
In NooJ, Finite-State Automata are a special case of Finite-State Transducers that do not produce any result (i.e. they have no output). NooJ’s users typically use FSA to locate morpho-syntactic patterns in corpora, and extract the matching sequences to build indices, concordances, etc.
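As a rough illustration of this use, the following Python sketch locates a pattern in a tokenized text and returns concordance rows (left context, match, right context). It is a simplified stand-in rather than NooJ's engine: the pattern is a fixed-length sequence of token sets, which corresponds to a linear automaton with no output, and the function name and the `width` parameter are assumptions of this example.

```python
from typing import List, Set, Tuple


def concordance(
    tokens: List[str], pattern: List[Set[str]], width: int = 2
) -> List[Tuple[str, str, str]]:
    """Return (left context, match, right context) for every match of `pattern`."""
    rows: List[Tuple[str, str, str]] = []
    n = len(pattern)
    for i in range(len(tokens) - n + 1):
        window = tokens[i : i + n]
        # Each position of the pattern lists the tokens it accepts.
        if all(tok in allowed for tok, allowed in zip(window, pattern)):
            left = " ".join(tokens[max(0, i - width) : i])
            right = " ".join(tokens[i + n : i + n + width])
            rows.append((left, " ".join(window), right))
    return rows
```

No output is produced for the matched sequences themselves; the matches are only located and reported with their context, which is the distinction the text draws between automata and transducers.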
Recursive Transition Networks are grammars that contain more than one graph; graphs can be FST or FSA, and also include references to other, embedded graphs; these latter graphs may in turn contain other references, to the same, or to other graphs. Generally, RTNs are used in NooJ to build libraries of graphs from the bottom-up: simple graphs are designed; then, they are re-used in more general graphs; these ones in turn are re-used, etc.
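The bottom-up reuse of graphs can be sketched in Python as follows. This is an illustrative recognizer, not NooJ's implementation: the graph names, the vocabulary and the convention that a label starting with ":" refers to another graph are assumptions of this sketch. A left-recursive reference (a graph that calls itself before consuming any token) would not terminate here.

```python
from typing import Dict, List, Set, Tuple

# Each graph is (transitions, final states). A label starting with ":" is a
# reference to another graph; any other label matches one token literally.
Graph = Tuple[Dict[int, List[Tuple[str, int]]], Set[int]]

GRAPHS: Dict[str, Graph] = {
    "Det": ({0: [("the", 1), ("a", 1)]}, {1}),
    "NP": ({0: [(":Det", 1)], 1: [("old", 1), ("dog", 2), ("cat", 2)]}, {2}),
    "Main": ({0: [(":NP", 1)], 1: [("sees", 2)], 2: [(":NP", 3)]}, {3}),
}


def match(graph: str, tokens: List[str], start: int) -> Set[int]:
    """Return every token position at which `graph` can end when started at `start`."""
    transitions, finals = GRAPHS[graph]
    ends: Set[int] = set()
    stack = [(0, start)]
    seen: Set[Tuple[int, int]] = set()
    while stack:
        state, pos = stack.pop()
        if (state, pos) in seen:
            continue
        seen.add((state, pos))
        if state in finals:
            ends.add(pos)
        for label, nxt in transitions.get(state, []):
            if label.startswith(":"):
                # Embedded graph: recurse, then continue from wherever it ended.
                for end in match(label[1:], tokens, pos):
                    stack.append((nxt, end))
            elif pos < len(tokens) and tokens[pos] == label:
                stack.append((nxt, pos + 1))
    return ends


def accepts(tokens: List[str]) -> bool:
    """True if the top-level graph "Main" recognizes the whole token sequence."""
    return len(tokens) in match("Main", tokens, 0)
```

Here the small "Det" graph is reused by "NP", which is in turn reused twice by "Main", mirroring the way libraries of graphs are built from simple components.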
Enhanced Recursive Transition Networks are RTNs that contain variables; these variables typically store parts of the matching sequences, which are then used to perform some operation (e.g. put their content in the plural) and to produce the resulting output.
Because variables can be duplicated, inserted and/or displaced in the output, ERTNs give NooJ the power of performing linguistic transformations on texts. Examples of transformations include negation, passivization, nominalization, etc.
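The role of variables in such a transformation can be shown with a short Python sketch of passivization. It illustrates only the variable mechanism (storing matched parts and displacing them in the output), not recursion or NooJ's grammar syntax; the variable names, the word lists and the participle table are invented for this example.

```python
from typing import Dict, List, Optional, Set, Tuple

# Pattern: each item is (variable name, tokens accepted at that position).
PATTERN: List[Tuple[str, Set[str]]] = [
    ("Subj", {"John", "Mary"}),
    ("Verb", {"sees", "helps"}),
    ("Obj", {"John", "Mary"}),
]
# Hypothetical lookup used to put the verb in its past participle form.
PAST_PARTICIPLE = {"sees": "seen", "helps": "helped"}


def bind(tokens: List[str]) -> Optional[Dict[str, str]]:
    """Match `tokens` against PATTERN and store each matched token in a variable."""
    if len(tokens) != len(PATTERN):
        return None
    variables: Dict[str, str] = {}
    for token, (name, allowed) in zip(tokens, PATTERN):
        if token not in allowed:
            return None
        variables[name] = token
    return variables


def passivize(sentence: str) -> Optional[str]:
    """Build the passive counterpart by displacing the stored variables."""
    variables = bind(sentence.split())
    if variables is None:
        return None
    participle = PAST_PARTICIPLE[variables["Verb"]]
    return f'{variables["Obj"]} is {participle} by {variables["Subj"]}'
```

With these definitions, passivize("John sees Mary") yields "Mary is seen by John": the subject and object variables swap places in the output, and the verb variable is modified before being inserted.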
Regular Expressions also constitute a quick way to enter simple queries without having to construct grammars. When the sequence to be located consists of a few words, it is much quicker to enter these words directly into a regular expression. However, as the query becomes more and more complex, as is usually the case in Linguistics, one should build a grammar.
In NooJ, CFGs constitute an alternative means to enter morphological or syntactic grammars.
For instance, NooJ includes an inflectional/derivational module that is associated with its dictionaries, so that it can automatically link dictionary entries with their corresponding forms that occur in corpora (this functionality allows NooJ to get rid of INTEX’s full form dictionaries such as DELAF and DELACFs).
NooJ dictionaries generally associate each lexical entry with an inflectional and/or derivational paradigm. For instance, all the verbs that conjugate like “aimer” are linked to the paradigm “+FLX=AIMER”; all the verbs that accept the “-able” suffix are linked to the paradigm “+DRV=ABLE”, etc.
Paradigms such as “AIMER” or “ABLE” are described either graphically in RTNs or by CFGs in text files.
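How a paradigm links a dictionary entry to the forms that occur in texts can be sketched in Python. The rule format below (number of final characters to delete, suffix to add, descriptive label) is an assumption of this sketch and is not NooJ's paradigm syntax; only a few present-tense forms are listed, and the labels are plain descriptions rather than NooJ codes.

```python
from typing import Dict, List, Tuple

# A rule is (characters to delete from the end of the lemma, suffix, label).
Rule = Tuple[int, str, str]

PARADIGMS: Dict[str, List[Rule]] = {
    "AIMER": [
        (0, "", "infinitive"),
        (1, "", "present, 1st person singular"),   # aimer -> aime
        (2, "ons", "present, 1st person plural"),  # aimer -> aimons
        (2, "ez", "present, 2nd person plural"),   # aimer -> aimez
    ],
}

# Each lexical entry names the paradigm it follows, as in "+FLX=AIMER".
LEXICON: Dict[str, str] = {"aimer": "AIMER", "parler": "AIMER"}


def inflect(lemma: str) -> Dict[str, str]:
    """Generate the forms of `lemma` and map each one to its label."""
    forms: Dict[str, str] = {}
    for delete, suffix, label in PARADIGMS[LEXICON[lemma]]:
        stem = lemma[: len(lemma) - delete]
        forms[stem + suffix] = label
    return forms


def build_index() -> Dict[str, List[Tuple[str, str]]]:
    """Map every generated form back to its (lemma, label) analyses."""
    index: Dict[str, List[Tuple[str, str]]] = {}
    for lemma in LEXICON:
        for form, label in inflect(lemma).items():
            index.setdefault(form, []).append((lemma, label))
    return index
```

Because the forms are generated from the lemma and its paradigm, a form found in a corpus (for example "parlons") can be traced back to its entry without storing a separate full-form dictionary.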
Linguistic Resources
With NooJ, linguists build, test and maintain two basic types of linguistic resources:
- Dictionaries (.dic files)
  - usually associate words or expressions with a set of information, such as:
    - a category (e.g. “Verb”),
    - one or more inflectional and/or derivational paradigms (e.g. how to conjugate verbs, how to nominalize them),
    - one or more syntactic properties (e.g. “+transitive” or “+N0VN1PREPN2”),
    - one or more semantic properties (e.g. distributional classes such as “+Human”, domain classes such as “+Politics”).
  - Lexical properties can be binary, such as “+plural”, or can be expressed as an attribute-value pair, such as “+gender=plural”.
  - Values can belong to the meta-language, such as in “+gender=plural”, to the input language, such as in “+synonym=pencil”, or to another language, such as in “+FR=crayon”.
  - NooJ’s dictionaries constitute a converged and enhanced version of the DELA-type dictionaries that were used in INTEX: a NooJ dictionary can
    - include simple words (like a DELAS),
    - include multi-word units (like a DELAC), and
    - link lexical entries to a canonical form (like a DELAV).
  - Contrary to INTEX, NooJ does not need full inflected form dictionaries (no more DELAF or DELACF).
  - NooJ’s ability to type pieces of information (e.g. “masculine” is a value of the “gender” property) allows it to process lexicon-grammar tables as well. Indeed, NooJ can display any dictionary in a “list” form or in a “table” form.
- Grammars
  - are used to represent a wide range of linguistic phenomena, from the orthographical and morphological levels up to the syntagmatic and transformational syntactic levels.
  - NooJ has three types of grammars:
    - Inflectional and derivational grammars (.nof files) are used to represent the inflection (e.g. conjugation) or the derivation (e.g. nominalization) properties of lexical entries. These descriptions can be entered either graphically or in the form of rules.
    - Lexical, orthographical, morphological or terminological grammars (.nom files) are used to represent sets of word forms and associate them with lexical information, e.g. to standardize the spelling of word or term variants, to recognize and tag neologisms, or to link synonymous expressions together.
    - Syntactic or semantic grammars (.nog files) are used to recognize and annotate expressions in texts, e.g. to tag noun phrases, certain syntactic constructs or idiomatic expressions, to extract certain expressions of interest (names of companies, expressions of dates, addresses, etc.), or to disambiguate words by filtering out some lexical or syntactic annotations in the text.
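The distinction drawn above between binary lexical properties and attribute-value pairs in dictionary entries can be made concrete with a small Python sketch. The line layout assumed here ("lemma,category+property+attribute=value") and the function name are assumptions of this example; real .dic files may use escaping or other conventions that the sketch does not handle.

```python
from typing import Dict, Tuple, Union

Value = Union[bool, str]


def parse_entry(line: str) -> Tuple[str, str, Dict[str, Value]]:
    """Split an entry into lemma, category and a dictionary of properties."""
    lemma, _, rest = line.partition(",")
    category, *props = rest.split("+")
    properties: Dict[str, Value] = {}
    for prop in props:
        name, sep, value = prop.partition("=")
        # "+transitive" is binary (stored as True); "+FLX=AIMER" is a pair.
        properties[name] = value if sep else True
    return lemma, category, properties
```

Storing typed properties in this way is also what makes a "table" view possible: each property name becomes a column, and each entry fills in its value or leaves the cell empty.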
Using NooJ functionalities
In its Standard edition, NooJ’s functions are available via a command-line program, noojapply.exe, which is stored in NooJ’s _App directory alongside NooJ.exe.
noojapply.exe can be called either directly from a “SHELL” script or from more sophisticated programs written in Perl, C++, Java, etc.
noojapply.exe allows users to apply dictionaries and grammars to texts and corpora automatically.
If you are planning to use NooJ’s functionalities in a professional environment (e.g. to build a linguistic research engine), note that they are also available via:
- a .NET dynamic library, noojengine.dll, made up of a set of public object classes and methods. These classes and methods can be used by any .NET application, in any .NET programming language. noojengine.dll allows users to build sophisticated applications such as Web services, and can be used to build much more efficient NLP applications than noojapply.exe.
- a noojservice.exe / noojclient.exe client-server application, based on a Windows service, that provides the functionalities of NooJ’s morphological and syntactic parsers in a Multi-Agent System and can be used to build a massively parallel NLP application.
NooJ Users
NooJ can be freely downloaded.
Most laboratories and academic centers use NooJ as a research or educational tool: some users are interested in its corpus-processing functionalities (analyzing literary texts, searching for and extracting information from newspapers or technical corpora, etc.); others use NooJ to formalize certain linguistic phenomena (e.g. to describe a language’s morphology); others use it for computational applications (automatic text analysis), etc.
Some NooJ users actively help the NooJ project by giving away some of their linguistic resources, projects or demos, labs, tutorials or documentation. These users, who constitute “NooJ’s community”, should be considered NooJ’s “co-authors”. The Community Edition of the NooJ application (which is also free) is an extended version of NooJ that gives full access to its internal functionalities as well as privileged access to the sources of its linguistic resources.
NooJ users meet once a year at the NooJ conference. NooJ tutorials and workshops are regularly organized during the year.
NooJ Conferences
- NooJ 2011, Dubrovnik, Croatia: http://lt.ffzg.hr/nooj2011/
- NooJ 2010, Komotini, Greece: http://www.gavriilidou.gr/nooj2010/en/epiloges/programme
- NooJ 2009, Tozeur, Tunisia: http://www.miracl.rnu.tn/nooj/index.php?option=com_content&view=article&id=5&Itemid=5&lang=en
- NooJ 2008, Budapest, Hungary: http://www.nytud.hu/nooj08/programme.html
- NooJ 2007, Barcelona, Spain
- NooJ 2006, Belgrade, Serbia
- NooJ 2005, Besançon, France
Further reading
- Abdelmajid Ben Hamadou, Slim Mesfar, Max Silberztein (Eds): Finite State Language Engineering: NooJ 2009 International Conference and Workshop (Touzeur), Centre de Publication Universitaire, 2010.
- Xavier Blanco, Max Silberztein (Eds): Proceedings of the 2007 International NooJ Conference (Barcelona), Cambridge Scholars Publishing (18 selected papers, 296 pages), 2008.
- Svetla Koeva, Denis Maurel, Max Silberztein (Eds): Formaliser les langues avec l'ordinateur : de INTEX à NooJ, Cahiers de la MSH Ledoux, Presses Universitaires de Franche-Comté (23 articles, 438 pages), 2007.