Metadata publishing
Metadata publishing is the process of making metadata data elements available to external users, both people and machines, using a formal review process and a commitment to change control processes.
Metadata publishing is the foundation upon which advanced distributed computing functions are being built. But like building foundations, care must be taken in metadata publishing systems to ensure the structural integrity of the systems built on top of them.
Definition of metadata publishing
Published metadata has the following characteristics:
- Metadata structures are available to the general public on a public web site or by download
- There is a documented review and approval process for adding data elements to the system or updating them
- New releases are made available without disturbing prior versions
- The publishing organization makes a commitment to a change control process
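The characteristics above can be illustrated with a minimal sketch of a registry in which only approved proposals are published and every release is an immutable snapshot, so a new release never disturbs a prior one. The class, method, and element names here (`PublishedRegistry`, `propose`, `approve`, `publish`, `PersonBirthDate`) are hypothetical and are not taken from any particular registry product.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class DataElement:
    """A published data element: a name plus its definition."""
    name: str
    definition: str


class PublishedRegistry:
    """Hypothetical in-memory registry; not a real product's API."""

    def __init__(self):
        self._releases = {}   # version -> read-only {name: DataElement}
        self._pending = {}    # name -> [DataElement, approved flag]
        self._latest = None   # most recently published version

    def propose(self, element):
        """Submit a new or updated element for review."""
        self._pending[element.name] = [element, False]

    def approve(self, name):
        """Record that the review process accepted the proposal."""
        self._pending[name][1] = True

    def publish(self, version):
        """Freeze all approved proposals into a new release."""
        if version in self._releases:
            raise ValueError(f"release {version} already exists")
        # Start from a copy of the latest release so older ones stay untouched.
        snapshot = {}
        if self._latest is not None:
            snapshot = dict(self._releases[self._latest])
        for name, (element, approved) in list(self._pending.items()):
            if approved:
                snapshot[name] = element
                del self._pending[name]
        self._releases[version] = MappingProxyType(snapshot)
        self._latest = version

    def release(self, version):
        """Return the read-only contents of one published release."""
        return self._releases[version]


registry = PublishedRegistry()
registry.propose(DataElement("PersonBirthDate", "Date on which a person was born"))
registry.approve("PersonBirthDate")
registry.publish("1.0")

registry.propose(DataElement("PersonBirthDate", "Calendar date on which a person was born"))
registry.approve("PersonBirthDate")
registry.publish("2.0")

# Release 1.0 still carries the original definition.
print(registry.release("1.0")["PersonBirthDate"].definition)
```

Copying the previous snapshot on each release trades some memory for a simple guarantee: consumers pinned to an old version keep seeing exactly what was published under that version.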
Benefits of metadata publishing
When classifying the benefits of metadata publishing, two groups are usually considered. External parties are usually consumers of information who are not part of the publishing organization. Internal parties are usually the various business units or departments within an organization.
Benefits to external parties
- Allows external systems (both people and agents) to have a clear understanding of the semantics of data elements in a system
- Allows third parties to build semantic maps between data models and to import and export data between systems
- Promotes service-oriented architectures and allows horizontal sharing of information between traditional information silos
- Allows systems to participate in accurately indexed and federated search processes
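As a rough illustration of the semantic maps mentioned in the list above, the sketch below assumes that two systems have each declared which published data element every one of their fields corresponds to. All field and element names (`cust_no`, `customerId`, `CustomerIdentifier`, and so on) are invented for the example.

```python
# Hypothetical field names for two systems, each mapped to the
# published data element that carries the same meaning.
CRM_TO_PUBLISHED = {"cust_no": "CustomerIdentifier", "dob": "PersonBirthDate"}
BILLING_TO_PUBLISHED = {"customerId": "CustomerIdentifier", "birthDate": "PersonBirthDate"}


def build_semantic_map(source_map, target_map):
    """Pair source and target fields that share a published element."""
    published_to_target = {element: field for field, element in target_map.items()}
    return {
        field: published_to_target[element]
        for field, element in source_map.items()
        if element in published_to_target
    }


def translate(record, semantic_map):
    """Rename the fields of a record; unmapped fields are dropped."""
    return {semantic_map[key]: value for key, value in record.items() if key in semantic_map}


crm_to_billing = build_semantic_map(CRM_TO_PUBLISHED, BILLING_TO_PUBLISHED)
print(translate({"cust_no": 42, "dob": "1980-01-01", "notes": "vip"}, crm_to_billing))
```

Because both systems map to the published element rather than to each other, adding a third system requires one new mapping instead of one per existing system.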
Benefits to internal parties
- Allows parties from diverse business units to agree on shared data definitions and to separate out department-specific or function-specific definitions
- Makes extract, transform, load (ETL) operations more precise for data warehousing
- Allows user interface designers to access a common pool of screen and report header labels
- Promotes model-driven architecture
Objections to metadata publishing
- Organizations that publish their metadata could make it easier for unauthorized people to find sensitive data if they breach the organization's firewall
- Vendors that publish their metadata risk customers building tools to export their data from the vendor's systems, thereby making it easier to migrate off the vendor's system
Core processes in metadata publishing
The following are some of the core processes in metadata publishing:
- Gathering of metadata requirements
- Selection of metadata registry and metadata publishing tools
- Training of project participants in metadata concepts
- Stakeholder group formation
- Metadata harvesting
- Glossary consolidation
- Initial upper ontology construction (abstract data elements)
- Draft data element loading
- Data element review process
- Publishing approved metadata elements in a variety of output formats (see below)
- Creation and maintenance of versions and deprecation of unused or redundant data elements
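The glossary consolidation step in the list above might look something like the following sketch, which merges term lists from several departments and separates terms whose definitions agree from those that need to go through the review process. The normalization rule and the sample glossaries are assumptions made for illustration only.

```python
from collections import defaultdict


def normalize(term):
    """Treat case, underscores, hyphens, and extra spaces as insignificant."""
    return " ".join(term.lower().replace("_", " ").replace("-", " ").split())


def consolidate(glossaries):
    """Merge glossaries given as {source: {term: definition}}.

    Returns (agreed, conflicts): agreed maps a term to its single
    definition, conflicts maps a term to {source: definition}.
    """
    merged = defaultdict(dict)
    for source, terms in glossaries.items():
        for term, definition in terms.items():
            merged[normalize(term)][source] = definition.strip()

    agreed, conflicts = {}, {}
    for term, by_source in merged.items():
        distinct = set(by_source.values())
        if len(distinct) == 1:
            agreed[term] = distinct.pop()
        else:
            conflicts[term] = by_source   # needs stakeholder review
    return agreed, conflicts


glossaries = {
    "sales": {"Customer_ID": "Unique identifier of a customer",
              "Region": "Sales territory"},
    "support": {"customer id": "Unique identifier of a customer",
                "region": "Geographic area of the caller"},
}
agreed, conflicts = consolidate(glossaries)
print(agreed)
print(conflicts)
```

In practice the conflicting terms would feed the data element review process, where stakeholders either settle on one shared definition or record separate department-specific elements.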
File format metadata publishing
Organizations that create applications that store data in file systems can also publish metadata definitions. One common way to do this is to store application data in a compressed XML file format. The XML files can be uncompressed and validated against an external XML Schema. The open source FreeMind tool is an example of this approach.
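The uncompress-then-validate flow can be sketched with the Python standard library. The standard library does not include an XML Schema validator, so the example below substitutes a small hand-written structural check for real schema validation; a production system would use a schema-aware validator instead. The element and attribute names (`notes`, `note`, `title`) are made up and do not describe any real application's file format.

```python
import gzip
import xml.etree.ElementTree as ET

# Hypothetical structural rules standing in for a published schema:
# element name -> attributes that must be present.
REQUIRED_ATTRIBUTES = {"notes": {"version"}, "note": {"title"}}


def load_compressed_xml(data):
    """Decompress gzip bytes and parse them as XML."""
    return ET.fromstring(gzip.decompress(data))


def check_structure(root):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    for element in root.iter():
        if element.tag not in REQUIRED_ATTRIBUTES:
            problems.append(f"unexpected element <{element.tag}>")
            continue
        missing = REQUIRED_ATTRIBUTES[element.tag] - set(element.attrib)
        for attribute in sorted(missing):
            problems.append(f"<{element.tag}> is missing attribute {attribute!r}")
    return problems


# Simulate an application file: XML compressed with gzip.
document = b'<notes version="1.0"><note title="Root"/><note/></notes>'
compressed = gzip.compress(document)

for problem in check_structure(load_compressed_xml(compressed)):
    print(problem)
```

Publishing the rules separately from the application is the point of the exercise: any third party can run the same check on a file without access to the program that wrote it.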
Metadata publishing formats
- HTML - used for browsing a web site and indexing by text-based search engines
- Web Ontology Language (OWL) - used by metadata search engines such as Swoogle
- XML Metadata Interchange (XMI) - OMG standard for exchanging metadata
- Common Warehouse Metamodel (CWM) - OMG standard for data warehouse metadata
- Topic maps - an ISO standard for the representation and interchange of knowledge, with an emphasis on the findability of information
- KM3 (Kernel Meta Meta Model) - as used in the metamodel zoos. The AtlanticZoo is an open source library of more than 100 metamodels under the EPL license. KM3 is a simple domain-specific language for specifying metamodels. A number of transformations are available to translate from KM3 to other notations such as XMI.
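To show how one set of approved elements might be published in more than one of these formats, the sketch below renders the same elements as an HTML definition list for people and as OWL (serialized as RDF/XML) for machines. Representing each data element as an `owl:DatatypeProperty` is a modeling assumption made for this example, and the base URI and element name are placeholders.

```python
import html
import xml.etree.ElementTree as ET

# Standard W3C namespace URIs for RDF, RDF Schema, and OWL.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"


def to_html(elements):
    """Render {name: definition} as an HTML definition list."""
    rows = [
        f"<dt>{html.escape(name)}</dt><dd>{html.escape(definition)}</dd>"
        for name, definition in sorted(elements.items())
    ]
    return "<dl>" + "".join(rows) + "</dl>"


def to_owl(elements, base_uri):
    """Render {name: definition} as OWL in RDF/XML.

    Assumption: each data element is modeled as an owl:DatatypeProperty.
    """
    for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
        ET.register_namespace(prefix, uri)
    root = ET.Element(f"{{{RDF}}}RDF")
    for name, definition in sorted(elements.items()):
        prop = ET.SubElement(
            root, f"{{{OWL}}}DatatypeProperty", {f"{{{RDF}}}about": base_uri + name}
        )
        ET.SubElement(prop, f"{{{RDFS}}}label").text = name
        ET.SubElement(prop, f"{{{RDFS}}}comment").text = definition
    return ET.tostring(root, encoding="unicode")


elements = {"PersonBirthDate": "Date on which a person was born"}
print(to_html(elements))
print(to_owl(elements, "http://example.org/elements#"))
```

Generating every format from a single approved source keeps the human-readable and machine-readable publications from drifting apart between releases.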
See also
- Data governance
- Metadata
- Semantic Web
- Semantic technology
- Metadata registry
- ISO/IEC 11179
- Topic Maps
External links
- MetaQuery examples provided by Ambient Webs LLC
- SWED portal provided by WordPressHelp
- Microsoft Metadata Publishing Example