Semantics encoding
Encyclopedia
A semantics encoding is a translation between formal language
s. For programmers, the most familiar form of encoding is the compilation of a programming language into machine code or byte-code. Conversion between document formats are also forms of encoding. Compilation of TeX
or LaTeX
documents to PostScript
are also commonly encountered encoding processes. Some high-level preprocessors such as Objective Caml
's Camlp4
or Apple Computer
's WorldScript
also involve encoding of a programming language into another.
Formally, an encoding of a language A into language B is a mapping of all terms of A into B. If there is a satisfactory encoding of A into B, B is considered at least as powerful (or at least as expressive) as A.
Commonly, an encoding is expected to preserve a number of properties.
completeness : For every n-ary operator of A, there exists an n-ary operator of B such that
(Note: as far as the author is aware of, this criterion of completeness is never used.)
Preservation of compositions is useful insofar as it guarantees that components can be examined either separately or together without "breaking" any interesting property. In particular, in the case of compilations, this soundness guarantees the possibility of proceeding with separate compilation of components, while completeness guarantees the possibility of de-compilation.
We write for one step of reduction and for any number of steps of reduction.
soundness : For every terms of language A, if then .
completeness : For every term of language A and every terms of language B, if then there exists some such that .
This preservation guarantees that both languages behave the same way. Soundness guarantees that all possible behaviours are preserved while completeness guarantees that no behaviour is added by the encoding. In particular, in the case of compilation of a programming language, soundness and completeness together mean that the compiled program behaves accordingly to the high-level semantics of the programming language.
soundness : for any term , if all reductions of converge, then all reductions of converge.
completeness : for any term , if all reductions of converge, then all reductions of converge.
In the case of compilation of a programming language, soundness guarantees that the compilation does not introduce non-termination such as endless loops or endless recursions. The completeness property is useful when language B is used to study or test a program written in language A, possibly by extracting key parts of the code: if this study or test proves that the program terminates in B, then it also terminates in A.
, a typical observable is the result of page rendering.
soundness : for every observable on terms of A, there exists an observable of terms of B such that for any term with observable , has observable .
completeness : for every observable on terms of A, there exists an observable on terms of B such that for any term with observable , has observable .
soundness : for every terms , if simulates then simulates .
completeness : for every terms , if simulates then simulates .
Preservation of simulations is a much stronger property than preservation of observations, which it entails. In turn, it is weaker than a property of preservation of bisimulation
s. As in previous cases, soundness is important for compilation, while completeness is useful for testing or proving properties.
soundness : if two terms and are equivalent in A, then and are equivalent in B.
completeness : if two terms and are equivalent in B, then and are equivalent in A.
or E, this means distribution of processes and data among several computers or CPUs.
soundness : if a term is the composition of two agents then must be the composition of two agents .
completeness : if a term is the composition of two agents then must be the composition of two agents such that and .
Formal language
A formal language is a set of words—that is, finite strings of letters, symbols, or tokens that are defined in the language. The set from which these letters are taken is the alphabet over which the language is defined. A formal language is often defined by means of a formal grammar...
s. For programmers, the most familiar form of encoding is the compilation of a programming language into machine code or byte-code. Conversion between document formats are also forms of encoding. Compilation of TeX
TeX
TeX is a typesetting system designed and mostly written by Donald Knuth and released in 1978. Within the typesetting system, its name is formatted as ....
or LaTeX
LaTeX
LaTeX is a document markup language and document preparation system for the TeX typesetting program. Within the typesetting system, its name is styled as . The term LaTeX refers only to the language in which documents are written, not to the editor used to write those documents. In order to...
documents to PostScript
PostScript
PostScript is a dynamically typed concatenative programming language created by John Warnock and Charles Geschke in 1982. It is best known for its use as a page description language in the electronic and desktop publishing areas. Adobe PostScript 3 is also the worldwide printing and imaging...
are also commonly encountered encoding processes. Some high-level preprocessors such as Objective Caml
Objective Caml
OCaml , originally known as Objective Caml, is the main implementation of the Caml programming language, created by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy and others in 1996...
's Camlp4
Camlp4
Camlp4 is a software system for writing extensible parsers for programming languages. It provides a set of Objective Caml libraries that are used to define grammars as well as loadable syntax extensions of such grammars...
or Apple Computer
Apple Computer
Apple Inc. is an American multinational corporation that designs and markets consumer electronics, computer software, and personal computers. The company's best-known hardware products include the Macintosh line of computers, the iPod, the iPhone and the iPad...
's WorldScript
WorldScript
WorldScript was the multilingual text rendering engine for Apple Macintosh before Mac OS X was introduced.Starting with version 7.1, Apple unified the implementation of non-Roman script systems in a programming interface called WorldScript. WorldScript I was used for all one-byte character sets and...
also involve encoding of a programming language into another.
Formally, an encoding of a language A into language B is a mapping of all terms of A into B. If there is a satisfactory encoding of A into B, B is considered at least as powerful (or at least as expressive) as A.
Properties
An informal notion of translation is not sufficient to help determine expressivity of languages, as it permits trivial encodings such as mapping all elements of A to the same element of B. Therefore, it is necessary to determine the definition of a "good enough" encoding. This notion varies with the application.Commonly, an encoding is expected to preserve a number of properties.
Preservation of compositions
soundness : For every n-ary operator of A, there exists an n-ary operator of B such thatcompleteness : For every n-ary operator of A, there exists an n-ary operator of B such that
(Note: as far as the author is aware of, this criterion of completeness is never used.)
Preservation of compositions is useful insofar as it guarantees that components can be examined either separately or together without "breaking" any interesting property. In particular, in the case of compilations, this soundness guarantees the possibility of proceeding with separate compilation of components, while completeness guarantees the possibility of de-compilation.
Preservation of reductions
This assumes the existence of a notion of reduction on both language A and language B. Typically, in the case of a programming language, reduction is the relation which models the execution of a program.We write for one step of reduction and for any number of steps of reduction.
soundness : For every terms of language A, if then .
completeness : For every term of language A and every terms of language B, if then there exists some such that .
This preservation guarantees that both languages behave the same way. Soundness guarantees that all possible behaviours are preserved while completeness guarantees that no behaviour is added by the encoding. In particular, in the case of compilation of a programming language, soundness and completeness together mean that the compiled program behaves accordingly to the high-level semantics of the programming language.
Preservation of termination
This also assumes the existence of a notion of reduction on both language A and language B.soundness : for any term , if all reductions of converge, then all reductions of converge.
completeness : for any term , if all reductions of converge, then all reductions of converge.
In the case of compilation of a programming language, soundness guarantees that the compilation does not introduce non-termination such as endless loops or endless recursions. The completeness property is useful when language B is used to study or test a program written in language A, possibly by extracting key parts of the code: if this study or test proves that the program terminates in B, then it also terminates in A.
Preservation of observations
This assumes the existence of a notion of observation on both language A and language B. In programming languages, typical observables are results of inputs and outputs, by opposition to pure computation. In a description language such as HTMLHTML
HyperText Markup Language is the predominant markup language for web pages. HTML elements are the basic building-blocks of webpages....
, a typical observable is the result of page rendering.
soundness : for every observable on terms of A, there exists an observable of terms of B such that for any term with observable , has observable .
completeness : for every observable on terms of A, there exists an observable on terms of B such that for any term with observable , has observable .
Preservation of simulations
This assumes the existence of notion of simulation on both language A and language B. In a programming languages, a program simulates another if it can perform all the same (observable) tasks and possibly some others. Simulations are used typically to describe compile-time optimizations.soundness : for every terms , if simulates then simulates .
completeness : for every terms , if simulates then simulates .
Preservation of simulations is a much stronger property than preservation of observations, which it entails. In turn, it is weaker than a property of preservation of bisimulation
Bisimulation
In theoretical computer science a bisimulation is a binary relation between state transition systems, associating systems which behave in the same way in the sense that one system simulates the other and vice-versa....
s. As in previous cases, soundness is important for compilation, while completeness is useful for testing or proving properties.
Preservation of equivalences
This assumes the existence of a notion of equivalence on both language A and language B. Typically, this can be a notion of equality of structured data or a notion of syntactically different yet semantically identical programs, such as structural congruence or structural equivalence.soundness : if two terms and are equivalent in A, then and are equivalent in B.
completeness : if two terms and are equivalent in B, then and are equivalent in A.
Preservation of distribution
This assumes the existence of a notion of distribution on both language A and language B. Typically, for compilation of distributed programs written in Acute, JoCamlJoCaml
JoCaml is an experimental functional programming language derived from OCaml. It integrates the primitives of the join-calculus to enable flexible, type-checked concurrent and distributed programming.- External links :* *...
or E, this means distribution of processes and data among several computers or CPUs.
soundness : if a term is the composition of two agents then must be the composition of two agents .
completeness : if a term is the composition of two agents then must be the composition of two agents such that and .
See also
- BisimulationBisimulationIn theoretical computer science a bisimulation is a binary relation between state transition systems, associating systems which behave in the same way in the sense that one system simulates the other and vice-versa....
- CompilerCompilerA compiler is a computer program that transforms source code written in a programming language into another computer language...
- SemanticsSemanticsSemantics is the study of meaning. It focuses on the relation between signifiers, such as words, phrases, signs and symbols, and what they stand for, their denotata....
- Semantic dictionary encodingSemantic dictionary encodingSemantic dictionary encoding preserves the full semantic context of source programs while adding further information that can be used for accelerating the speed of code generation. SDE forms a code-generating loader. It is a form of bytecode combined with a JIT compiler...
(SDE)