SKOS
Simple Knowledge Organization System (SKOS) is a family of formal languages designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is built upon RDF and RDFS, and its main objective is to enable easy publication of controlled structured vocabularies for the Semantic Web. SKOS is currently developed within the W3C framework.

DESIRE II project (1997-2000)

The most direct ancestor of SKOS was the RDF Thesaurus work undertaken in the second phase of the EU DESIRE project. Motivated by the need to improve the user interface and usability of multi-service browsing and searching, the project produced a basic RDF vocabulary for thesauri. As noted later in the SWAD-Europe workplan, the DESIRE work was adopted and further developed in the SOSIG and LIMBER projects. A version of the DESIRE/SOSIG implementation was described at W3C's QL'98 workshop, motivating early work on RDF rule and query languages: "A Query and Inference Service for RDF".

LIMBER (1999-2001)

SKOS built upon the output of the Language Independent Metadata Browsing of European Resources (LIMBER) project, funded by the European Community as part of the Information Society Technologies programme. In the LIMBER project, CCLRC further developed an RDF thesaurus interchange format, which was demonstrated on the European Language Social Science Thesaurus (ELSST) at the UK Data Archive. ELSST is a multilingual version of the English-language Humanities and Social Science Electronic Thesaurus (HASSET) and was planned to be used by the Council of European Social Science Data Archives (CESSDA).

SWAD-Europe (2002-2004)

SKOS as a distinct initiative began in the SWAD-Europe project, bringing together partners from DESIRE, SOSIG (ILRT) and LIMBER (CCLRC) who had worked with earlier versions of the schema. It was developed in the Thesaurus Activity Work Package of the Semantic Web Advanced Development for Europe (SWAD-Europe) project. SWAD-Europe was funded by the European Community as part of the Information Society Technologies programme. The project was designed to support W3C's Semantic Web Activity through research, demonstrators and outreach efforts conducted by the five project partners: ERCIM, the ILRT at Bristol University, HP Labs, CCLRC and Stilo.
The first releases of SKOS Core and SKOS Mapping were published at the end of 2003, along with other deliverables on RDF encoding of multilingual thesauri and thesaurus mapping.

Semantic web activity (2004-2005)

Following the end of SWAD-Europe, the SKOS effort was supported by the W3C Semantic Web Activity within the Best Practices and Deployment Working Group. During this period, the focus was on consolidating SKOS Core and on developing practical guidelines for porting and publishing thesauri for the Semantic Web.

Later status and roadmap (2006-2008)

SKOS is a work in progress, and the main published documents — the SKOS Core Guide, the SKOS Core Vocabulary Specification, and the Quick Guide to Publishing a Thesaurus on the Semantic Web — have W3C Working Draft status. The main editors of SKOS are Alistair Miles and Dan Brickley.

The new Semantic Web Deployment Working Group, chartered for two years (May 2006 - April 2008), has committed in its charter to push SKOS forward on the W3C Recommendation track. The roadmap projects SKOS as a Candidate Recommendation by the end of 2007, and as a Proposed Recommendation in the first quarter of 2008. The main issues to solve are determining its precise scope of use, and its articulation with other RDF languages and standards used in libraries (such as Dublin Core).

(2009-08-18)

On 18 August 2009, the W3C announced SKOS as a new standard that builds a bridge between the world of knowledge organization systems (including thesauri, classifications, subject headings, taxonomies, and folksonomies) and the linked data community, bringing benefits to both. Libraries, museums, newspapers, government portals, enterprises, social networking applications, and other communities that manage large collections of books, historical artifacts, news reports, business glossaries, blog entries, and other items can now use SKOS to leverage the power of linked data.

Community and participation

All development work is carried out via the public-esw-thes@w3.org mailing list, a completely open and publicly archived list devoted to discussion of issues relating to knowledge organisation systems, information retrieval and the Semantic Web. Anyone may participate informally in the development of SKOS by joining the discussions on this list; informal participation is warmly welcomed. Anyone who works for a W3C member organisation may formally participate in the development process by joining the Semantic Web Deployment Working Group, which entitles individuals to edit specifications and to vote on publication decisions.

Components

SKOS is designed as a modular and extensible family of languages, and is intended to be as simple as possible to use and implement.

SKOS Core

SKOS Core defines the classes and properties sufficient to represent the common features found in a standard thesaurus. It is based on a concept-centric view of the vocabulary, where primitive objects are not terms, but abstract notions represented by terms. Each SKOS concept is defined as an RDF resource. Each concept can have RDF properties attached, including:
  • one or more preferred index terms (at most one in each natural language)
  • alternative terms or synonyms
  • definitions and notes, with specification of their language


Concepts can be organized in hierarchies using broader-narrower relationships, or linked by non-hierarchical (associative) relationships.
Concepts can be gathered in concept schemes, to provide consistent and structured sets of concepts, representing whole or part of a controlled vocabulary.

These features represent the stable part of SKOS Core. Other elements of the vocabulary are still considered unstable.

SKOS Mapping

SKOS Mapping is intended to provide a vocabulary for expressing matches (exact or fuzzy) between concepts in one concept scheme and concepts in another. This part of SKOS was developed in the SWAD-Europe project and currently has no official home. It is maintained informally by the SKOS editors.
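As a rough illustration of exact versus fuzzy matching, the sketch below links concepts from two hypothetical schemes ("a:" and "b:") and looks up the counterparts of a concept. It assumes the mapping property names skos:exactMatch and skos:closeMatch from the later SKOS Reference, which may differ from the names used in the original SKOS Mapping vocabulary described here. The data and the helper function are invented for the example.

```python
# Mapping links between two hypothetical concept schemes, "a:" and "b:".
MAPPINGS = [
    ("a:cat", "skos:exactMatch", "b:felisCatus"),      # exact match
    ("a:pet", "skos:closeMatch", "b:companionAnimal"),  # fuzzy match
]

# Both properties are treated as symmetric, so a link can be followed
# from either end.
SYMMETRIC = {"skos:exactMatch", "skos:closeMatch"}


def mapped_concepts(concept, mappings, allowed=("skos:exactMatch",)):
    """Return concepts linked to `concept` by one of the allowed properties."""
    found = set()
    for subject, predicate, obj in mappings:
        if predicate not in allowed:
            continue
        if subject == concept:
            found.add(obj)
        elif obj == concept and predicate in SYMMETRIC:
            found.add(subject)
    return sorted(found)


if __name__ == "__main__":
    # Exact matches only (the default).
    print(mapped_concepts("a:cat", MAPPINGS))
    # Accept fuzzy matches as well.
    both = ("skos:exactMatch", "skos:closeMatch")
    print(mapped_concepts("a:pet", MAPPINGS, allowed=both))
```

Keeping the accepted properties as a parameter lets a caller decide how much imprecision to tolerate when translating between schemes.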

SKOS Extensions

SKOS Extensions are intended to provide ways to declare relationships between concepts with more specific semantics than the simple "broader-narrower", such as class-instance or partitive relationships. Like SKOS Mapping, this part is likely to stay in standby mode until SKOS Core is completed as a W3C Recommendation.

Metamodel

The SKOS metamodel is related to ISO 25964 (Thesauri for Information Retrieval).

Applications

  • Some important vocabularies have been migrated into SKOS format and are available in the public domain, including AGROVOC and GEMET. Library of Congress Subject Headings (LCSH) also support the SKOS format.
  • SKOS has been used as the language for the thesauri used in the SWED Environmental Directory developed in the SWAD-Europe project framework.
  • A way to convert thesauri to SKOS, with examples including the MeSH thesaurus, has been outlined by the Vrije Universiteit Amsterdam.
  • Subject classification using DITA and SKOS has been developed by IBM.
  • SKOS is used to represent geographical feature types in the GeoNames ontology.

Tools

  • ThManager is an open-source Java application for creating and visualizing SKOS vocabularies.
  • The W3C provides an experimental on-line validation service.
  • SKOS files can also be imported and edited in RDF-OWL editors such as Protégé, or SWOOP, which was developed by the Maryland Information and Network Dynamics Lab Semantic Web Agents Project (Mindswap).
  • SKOS synonyms can be transformed from the WordNet RDF format using an XSLT stylesheet (see W3C RDF).
  • PoolParty is a commercial-quality thesaurus management system and a SKOS editor for the Semantic Web including text analysis functionalities and Linked Data capabilities.
  • SKOSEd is an open source plugin for the Protégé 4 OWL ontology editor that supports authoring SKOS vocabularies. SKOSEd has an accompanying SKOS API written in Java that can be used to build SKOS-based applications.
  • Model Futures SKOS Exporter for Microsoft Excel (beta version) allows simple vocabularies to be developed as indented Excel spreadsheets and exported as SKOS RDF.
  • Lexaurus is an enterprise thesaurus management system and multi-format editor. Its extensive API includes full revision management. SKOS is one of its many supported formats.
  • TopBraid Enterprise Vocabulary Net (EVN) is a web-based solution for simplified development and management of interconnected controlled vocabularies. It supports collaboration on defining and linking enterprise vocabularies, taxonomies, thesauri and ontologies used for information integration, customization and search.

Data

There are publicly available SKOS data sources.
  • SKOS DataZone wiki: the W3C recommends this wiki as a list of publicly available SKOS data sources. Most data found there can be used for commercial and research applications.

SKOS and Thesaurus standards

SKOS development has involved experts from both the RDF and library communities, and SKOS intends to allow easy migration of thesauri defined by standards such as NISO Z39.19-2005 or ISO 5964:1985.

SKOS and other semantic web standards

SKOS is intended to provide a way to make a legacy of concept schemes available to Semantic Web applications that is simpler than the more complex ontology language, OWL. OWL is intended to express complex conceptual structures, which can be used to generate rich metadata and support inference tools. However, constructing useful web ontologies is demanding in terms of expertise, effort, and cost. In many cases, this type of effort might be superfluous or unsuited to requirements, and SKOS might be a better choice. The extensibility of RDF makes possible further incorporation or extension of SKOS vocabularies into more complex vocabularies, including OWL ontologies.

See also

  • Glossary
  • Knowledge representation
  • Metadata
  • Metadata registry
  • W3C
  • Taxonomy
  • ISO/CD 25964

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 