Dereferenceable Uniform Resource Identifier
A dereferenceable Uniform Resource Identifier, or dereferenceable URI, is a resource retrieval mechanism that uses any of the Internet protocols (e.g. HTTP) to obtain a copy or representation of the resource it identifies.
In the context of traditional HTML web pages, this is the normal and obvious way of working: a URI refers to the page, and when requested the web server returns a copy of it. In other, non-dereferenceable contexts, such as XML Schema, the namespace identifier is still a URI, but it is simply an identifier (i.e. a namespace name). There is no intention that it can or should be dereferenced. There is even a separate attribute, schemaLocation, which may contain a dereferenceable URI that does point to a copy of the schema document.
In the case of Linked Data, the representation takes the form of a document (typically HTML or XML) that describes the resource that the URI identifies. In either case, the mechanism makes it possible for a user (or software agent) to "follow your nose" to find out more information related to the identified resource.
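As a rough illustration of what dereferencing over HTTP involves, the following Python sketch builds (but does not send) a GET request for a URI. The helper name build_dereference_request and the Accept header value are assumptions made for this example, not part of any standard API; the URI is Tim Berners-Lee's, as listed under Background.

```python
from urllib.parse import urldefrag
from urllib.request import Request


def build_dereference_request(uri, accept="application/rdf+xml, text/html;q=0.5"):
    """Build (but do not send) an HTTP GET request for the given URI.

    Hypothetical helper for illustration; the default Accept value is an
    assumption about which representations a client might prefer.
    """
    # The fragment (the part after '#') is not sent to the server,
    # so it is removed before the request is built.
    document_url, _fragment = urldefrag(uri)
    # The Accept header tells the server which document formats are wanted.
    return Request(document_url, headers={"Accept": accept}, method="GET")


request = build_dereference_request("http://www.w3.org/People/Berners-Lee/card#i")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would perform the actual retrieval; that step is left out so the sketch runs without network access.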
Background
In computing, identifiers are used to distinguish things and to facilitate data exchange. For example, two U.S. citizens with the same name would have different Social Security numbers (SSNs). In a totally distributed system, such as the World Wide Web, a URI is used to globally identify a thing in the world. Unfortunately, because the Web's architecture was designed around HTTP, URIs often identify web pages instead of the underlying things. To remove this confusion, URIs that identify things often include a hash (see the Formats section). The following example shows the difference between the URL of a person (which usually means his or her homepage) and the URI of a person:
- Dan Connolly's URL is "http://www.w3.org/People/Connolly/". It identifies his homepage, which was created in 1994. If computer A asks computer B "How old is http://www.w3.org/People/Connolly/?", computer B might answer "16" (in the year 2010).
- Dan Connolly's URI is "http://www.w3.org/People/Connolly/#me". It identifies Connolly himself, a person. If computer A asks computer B "How old is http://www.w3.org/People/Connolly/#me?", computer B might answer "35".
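The relationship between the two identifiers above can be checked mechanically. The sketch below uses Python's standard urllib.parse.urldefrag to split each identifier into a document URL and a fragment; the helper name split_hash_uri is an assumption made for this example.

```python
from urllib.parse import urldefrag

homepage_url = "http://www.w3.org/People/Connolly/"
person_uri = "http://www.w3.org/People/Connolly/#me"


def split_hash_uri(uri):
    """Return (document URL, fragment) for a URI; hypothetical helper."""
    document_url, fragment = urldefrag(uri)
    return document_url, fragment


# Both identifiers lead to the same document when dereferenced...
print(split_hash_uri(homepage_url))
print(split_hash_uri(person_uri))
# ...but as identifiers they are different strings naming different things.
print(homepage_url == person_uri)
```

The homepage URL has an empty fragment, while the person URI carries the fragment "me", so the two stay distinct as names even though a client would fetch the same document for both.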
Because of the nature of a URI, it can be dereferenced to get information about the thing it represents, hence the term dereferenceable URI. An SSN and a person's name are not dereferenceable because, even though you could search the Web for these strings, it is not guaranteed that the information exists and is unambiguous. In other words, there is no canonical way of dereferencing those identifiers. URIs, on the other hand, can be dereferenced by a standardized protocol such as HTTP.
Dereferenceable URIs are based on the well-established theory and practice of "data access by reference". This data access and manipulation mechanism is used extensively in general computer programming (e.g., C/C++ pointers) and in database call-level interfaces (e.g., ODBC and JDBC), amongst others. The term dereferencing describes the act of obtaining a representation of a description of an entity via its URI.
In the Semantic Web realm, dereferenceable URIs offer the critical fabric that drives the Giant Global Graph of interconnected data popularly referred to as Linked Data, another term coined by Tim Berners-Lee in his Linked Data Design Note and furthered by other articles such as "Cool URIs for the Semantic Web" by Sauermann and Cyganiak.
Eventually everything may have its own dereferenceable URI, but things that already have URIs and are described in an interoperable way include:
- People - defined in the FOAF vocabulary. For example, Tim Berners-Lee has the URI "http://www.w3.org/People/Berners-Lee/card#i".
- Organizations - defined in the FOAF vocabulary. For example, W3C has the URI "http://www.w3.org/data#W3C".
- Software projects - defined in the DOAP vocabulary. For example, Tabulator has the URI "http://dig.csail.mit.edu/2005/ajar/ajaw/data#Tabulator".
Formats
Dereferenceable URIs are constructed using one of two forms: hash or slash. The critical thing about either format is the underlying use of existing Web architecture to preserve the implicit identity (or pointer) function.
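A minimal Python sketch of telling the two forms apart is given below. The rule it applies (a non-empty fragment means the hash form, anything else is treated as the slash form) is a simplification assumed for illustration, and both the function name uri_form and the example.org URI are hypothetical.

```python
from urllib.parse import urlsplit


def uri_form(uri):
    """Classify a URI as 'hash' if it carries a fragment, else 'slash'.

    Simplified rule assumed for illustration only.
    """
    return "hash" if urlsplit(uri).fragment else "slash"


# A hash URI taken from the examples in this article.
print(uri_form("http://www.w3.org/data#W3C"))
# A hypothetical slash URI, invented for this example.
print(uri_form("http://example.org/id/alice"))
```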
Summary
In summary we can establish the following facts:
- A dereferenceable URI is a kind of Uniform Resource Identifier (one that is accessible via HTTP).
- A dereferenceable URI is a kind of reference (as found in existing computer science theory and practice).