Data dictionary
A data dictionary, or metadata repository, as defined in the IBM Dictionary of Computing, is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format." The term may have one of several closely related meanings pertaining to databases and database management systems (DBMS):
- a document describing a database or collection of databases
- an integral component of a DBMS that is required to determine its structure
- a piece of middleware that extends or supplants the native data dictionary of a DBMS
Documentation
The terms data dictionary and data repository indicate a more general software utility than a catalogue. A catalogue is closely coupled with the DBMS software; it provides the information stored in it to users and the DBA, but it is mainly accessed by the various software modules of the DBMS itself, such as DDL and DML compilers, the query optimiser, the transaction processor, report generators, and the constraint enforcer. A data dictionary, on the other hand, is a data structure that stores metadata, i.e., data about data. The software package for a stand-alone data dictionary or data repository may interact with the software modules of the DBMS, but it is mainly used by the designers, users, and administrators of a computer system for information resource management. These systems are used to maintain information on system hardware and software configuration, documentation, applications, and users, as well as other information relevant to system administration.

If a data dictionary system is used only by the designers, users, and administrators and not by the DBMS software, it is called a passive data dictionary; otherwise, it is called an active data dictionary, or simply a data dictionary. An active data dictionary is automatically updated as changes occur in the database. A passive data dictionary must be manually updated.
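The behaviour of an active data dictionary can be illustrated with a minimal sketch. It assumes SQLite (through Python's standard sqlite3 module) as the DBMS, which is not named in the text above; the customer table and the list_columns helper are hypothetical. The point is only that the catalog maintained by the DBMS reflects a schema change without any manual step.

```python
import sqlite3

def list_columns(conn, table):
    """Return (name, declared_type) pairs read from SQLite's own catalog.

    The table name is interpolated directly, so it must come from trusted code.
    """
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
before = list_columns(conn, "customer")

# Changing the schema updates the catalog automatically.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
after = list_columns(conn, "customer")

print(before)
print(after)
```

A passive data dictionary, by contrast, would be a separate document or table that somebody has to edit after running the ALTER TABLE statement.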
The data dictionary consists of record types (tables) created in the database by system-generated command files, tailored for each supported back-end DBMS. Command files contain SQL statements for CREATE TABLE, CREATE UNIQUE INDEX, ALTER TABLE (for referential integrity), etc., using the specific statement syntax required by that type of database.
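As a rough sketch of how such command files might be generated, the following Python function emits the three kinds of statement mentioned above for two back ends. The TYPE_MAP table, the command_file function, the orders table, and the index and constraint naming scheme are all assumptions made for illustration; they are not taken from any particular product.

```python
# Illustrative mapping from generic types to back-end column types (an assumption).
TYPE_MAP = {
    "postgresql": {"integer": "INTEGER", "string": "VARCHAR({length})"},
    "oracle": {"integer": "NUMBER(10)", "string": "VARCHAR2({length})"},
}

def command_file(table, columns, unique, foreign_keys, dialect):
    """Build the SQL statements for one table in the given dialect."""
    types = TYPE_MAP[dialect]
    col_defs = ", ".join(
        f"{name} {types[kind].format(length=length)}"
        for name, kind, length in columns
    )
    statements = [f"CREATE TABLE {table} ({col_defs});"]
    for col in unique:
        statements.append(
            f"CREATE UNIQUE INDEX {table}_{col}_ux ON {table} ({col});"
        )
    # Referential integrity is added afterwards with ALTER TABLE.
    for col, ref_table, ref_col in foreign_keys:
        statements.append(
            f"ALTER TABLE {table} ADD CONSTRAINT {table}_{col}_fk "
            f"FOREIGN KEY ({col}) REFERENCES {ref_table} ({ref_col});"
        )
    return statements

columns = [("id", "integer", None), ("code", "string", 20), ("customer_id", "integer", None)]
foreign_keys = [("customer_id", "customer", "id")]

pg = command_file("orders", columns, ["code"], foreign_keys, "postgresql")
ora = command_file("orders", columns, ["code"], foreign_keys, "oracle")

for statement in pg + ora:
    print(statement)
```

Keeping the table definition in one neutral structure and varying only the type mapping is one way to tailor the same dictionary to several back ends.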
Database users and application developers can benefit from an authoritative data dictionary document that catalogs the organization, contents, and conventions of one or more databases. This typically includes the names and descriptions of various tables and fields in each database, plus additional details, like the type and length of each data element. There is no universal standard as to the level of detail in such a document.
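A data dictionary document of this kind can be partly derived from the DBMS catalog. The sketch below assumes SQLite as the DBMS; the customer table, the DESCRIPTIONS mapping, and the describe_database function are hypothetical. Names, types, and lengths come from the catalog, while the descriptions have to be supplied by hand.

```python
import sqlite3

# Hand-written descriptions keyed by (table, field); purely illustrative.
DESCRIPTIONS = {
    ("customer", "id"): "Surrogate key",
    ("customer", "name"): "Full legal name",
}

def describe_database(conn, descriptions):
    """Return the lines of a plain-text data dictionary document."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        lines.append(f"Table: {table}")
        # Each catalog row is (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, col_type, *_rest in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            note = descriptions.get((table, name), "(no description)")
            lines.append(f"  {name}  {col_type}  {note}")
    return lines

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name VARCHAR(40))")

lines = describe_database(conn, DESCRIPTIONS)
print("\n".join(lines))
```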
Middleware
In the construction of database applications, it can be useful to introduce an additional layer of data dictionary software, i.e. middleware, which communicates with the underlying DBMS data dictionary. Such a "high-level" data dictionary may offer additional features and a degree of flexibility that goes beyond the limitations of the native "low-level" data dictionary, whose primary purpose is to support the basic functions of the DBMS, not the requirements of a typical application. For example, a high-level data dictionary can provide alternative entity-relationship models tailored to suit different applications that share a common database. Extensions to the data dictionary also can assist in query optimization against distributed databases.
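One way to picture the layering described above is a small class that reads the native catalog and merges application-level extensions into it. This is only a sketch under stated assumptions: SQLite stands in for the DBMS, and the HighLevelDictionary class, the EXTENSIONS mapping, and its label, required, and max_length keys are invented for illustration.

```python
import sqlite3

# Application-level metadata that the native catalog does not hold (hypothetical).
EXTENSIONS = {
    ("customer", "email"): {"label": "E-mail address", "required": True, "max_length": 60},
}

class HighLevelDictionary:
    """Combines the low-level DBMS catalog with application-level extensions."""

    def __init__(self, conn, extensions):
        self.conn = conn
        self.extensions = extensions

    def fields(self, table):
        result = {}
        for row in self.conn.execute(f'PRAGMA table_info("{table}")'):
            name, col_type = row[1], row[2]
            entry = {"type": col_type, "label": name, "required": False, "max_length": None}
            entry.update(self.extensions.get((table, name), {}))
            result[name] = entry
        return result

    def validate(self, table, record):
        """Return a list of error messages for one candidate record."""
        errors = []
        for name, meta in self.fields(table).items():
            value = record.get(name)
            if value in (None, ""):
                if meta["required"]:
                    errors.append(f"{meta['label']} is required")
                continue
            if meta["max_length"] is not None and len(str(value)) > meta["max_length"]:
                errors.append(f"{meta['label']} exceeds {meta['max_length']} characters")
        return errors

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT)")
dictionary = HighLevelDictionary(conn, EXTENSIONS)

missing = dictionary.validate("customer", {"id": 1})
valid = dictionary.validate("customer", {"id": 1, "email": "a@example.org"})
print(missing, valid)
```

The structural facts still come from the DBMS, so the extension layer only has to carry what the application needs beyond them.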
Software frameworks aimed at rapid application development sometimes include high-level data dictionary facilities, which can substantially reduce the amount of programming required to build menus, forms, reports, and other components of a database application, including the database itself. For example, PHPLens includes a PHP class library to automate the creation of tables, indexes, and foreign key constraints portably for multiple databases. Another PHP-based data dictionary, part of the RADICORE toolkit, automatically generates program objects, scripts, and SQL code for menus and forms with data validation and complex JOINs. For the ASP.NET environment, Base One's data dictionary provides cross-DBMS facilities for automated database creation, data validation, performance enhancement (caching and index utilization), application security, and extended data types.
See also
- Vocabulary OneSource
- Metadata
- Data modeling
- ISO/IEC 11179
- Metadata registry
- Semantic spectrum
- Data hierarchy
External links
- Yourdon, Structured Analysis Wiki, Data Dictionaries