Microdata (HTML5)
Microdata is a WHATWG HTML specification used to nest semantics within existing content on web pages. Search engines, web crawlers, and browsers can extract and process Microdata from a web page and use it to provide a richer browsing experience for users. Microdata uses a supporting vocabulary to describe an item and name-value pairs to assign values to its properties. Microdata helps technologies such as search engines and web crawlers better understand what information is contained in a web page, which can lead to better search results. Microdata is an attempt to provide a simpler way of annotating HTML elements with machine-readable tags than the similar approaches of using RDFa and Microformats.
Microdata Vocabularies
Microdata vocabularies provide the semantics, or meaning, of an item. Web developers can design a custom vocabulary or use vocabularies available on the web. A collection of commonly used (and Google-supported) Microdata vocabularies is located at http://data-vocabulary.org and includes: Person, Event, Organization, Product, Review, Review-aggregate, Breadcrumb, Offer, and Offer-aggregate. Other markup vocabularies are provided by Schema.org. Major search engines rely on this markup to improve search results. For some purposes, an ad-hoc vocabulary is adequate. For others, a vocabulary will need to be designed. Where possible, authors are encouraged to re-use existing vocabularies, as this makes content re-use easier.
Microdata Global Attributes
- itemscope – Creates the item and indicates that descendants of this element contain information about it.
- itemtype – A valid URL of a vocabulary that describes the item and the context of its properties.
- itemid – Indicates a unique identifier of the item.
- itemprop – Indicates that its containing tag holds the value of the specified item property. The property's name and value context are described by the item's vocabulary. Property values usually consist of string values, but can also be URLs, using the a element and its href attribute, the img element and its src attribute, or other elements that link to or embed external resources.
- itemref – Properties that are not descendants of the element with the itemscope attribute can be associated with the item using this attribute. Provides a list of ids of elements with additional properties elsewhere in the document.
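To make the value rules for itemprop concrete, here is a minimal Python sketch that reads a single, non-nested item using the standard library's html.parser. It is an illustration under stated assumptions, not a conforming Microdata parser: the vocabulary URL http://example.org/vocab/Book, the sample markup, and the names FlatItemParser and read_flat_item are all hypothetical. The sketch ignores itemref, nested items, and multiple space-separated property names, and it returns URL values exactly as written instead of resolving them.

```python
from html.parser import HTMLParser

# Elements whose property value is a URL taken from an attribute
# (a partial list, chosen for this sketch).
URL_ATTRS = {"a": "href", "area": "href", "link": "href",
             "img": "src", "audio": "src", "video": "src",
             "embed": "src", "iframe": "src", "source": "src",
             "object": "data"}


class FlatItemParser(HTMLParser):
    """Reads one non-nested item: its type, id, and itemprop values."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.item_id = None
        self.properties = {}
        self._open = None  # (tag, property name, text chunks) being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:  # boolean attribute: presence is enough
            self.item_type = attrs.get("itemtype")
            self.item_id = attrs.get("itemid")
        name = attrs.get("itemprop")
        if name is None:
            return
        if tag in URL_ATTRS:
            # Link and embed elements: the value is the URL attribute.
            self.properties[name] = attrs.get(URL_ATTRS[tag], "")
        else:
            # Other elements: the value is the text content, collected below.
            self._open = (tag, name, [])

    def handle_data(self, data):
        if self._open:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if self._open and tag == self._open[0]:
            _, name, chunks = self._open
            self.properties[name] = "".join(chunks).strip()
            self._open = None


def read_flat_item(html):
    """Return (itemtype, itemid, properties) for the item in html."""
    parser = FlatItemParser()
    parser.feed(html)
    parser.close()
    return parser.item_type, parser.item_id, parser.properties


# Hypothetical sample markup using an ad-hoc vocabulary.
SAMPLE = """
<div itemscope itemtype="http://example.org/vocab/Book" itemid="http://example.org/books/1">
  <span itemprop="title">An Example Title</span>
  <a itemprop="url" href="http://example.org/books/1/details">details</a>
  <img itemprop="cover" src="cover.png" alt="">
</div>
"""

item_type, item_id, properties = read_flat_item(SAMPLE)
print(item_type, item_id, properties)
```

Here the span contributes its text content, while the a and img elements contribute their href and src attributes, matching the description of itemprop above.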
Example
The following markup may be found on a typical about page containing information about a person:
Here is the same markup with added Microdata:
As the above example shows, Microdata items can be nested. In this case an item of type http://data-vocabulary.org/Address is nested inside an item of type http://data-vocabulary.org/Person.
The following text shows how Google parses the Microdata from the above example code. Developers can test pages containing Microdata using Google's Rich Snippet Testing Tool.
Item
  Type: http://data-vocabulary.org/Person
  name = John Doe
  title = graduate research assistant
  affiliation = University of Dreams
  nickname = Johnny
  url = http://www.johnnyd.com/
  address = Item(1)
Item 1
  Type: http://data-vocabulary.org/Address
  street-address = 1234 Peach Drive
  locality = Warner Robins
  region = Georgia
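The Python sketch below rebuilds a plausible version of the annotated markup from the parsed values listed above, then extracts the nested items with the standard library's html.parser. The element choices (section, span, a) and the surrounding sentences are assumptions, since only the item types, property names, and values are given. The extractor is a simplified illustration for well-formed markup and is not how Google's tool works; NestedItemParser and extract_items are hypothetical names.

```python
from html.parser import HTMLParser

# Reconstructed markup: only types, property names, and values come from
# the parsed output; tags and filler text are assumptions.
MARKUP = """
<section itemscope itemtype="http://data-vocabulary.org/Person">
  Hello, my name is <span itemprop="name">John Doe</span>,
  I am a <span itemprop="title">graduate research assistant</span>
  at the <span itemprop="affiliation">University of Dreams</span>.
  My friends call me <span itemprop="nickname">Johnny</span>.
  You can visit my homepage at
  <a href="http://www.johnnyd.com/" itemprop="url">www.johnnyd.com</a>.
  <section itemprop="address" itemscope
           itemtype="http://data-vocabulary.org/Address">
    I live at <span itemprop="street-address">1234 Peach Drive</span>,
    <span itemprop="locality">Warner Robins</span>,
    <span itemprop="region">Georgia</span>.
  </section>
</section>
"""

URL_ATTRS = {"a": "href", "link": "href", "img": "src"}  # partial list
VOID = {"area", "br", "embed", "hr", "img", "input", "link", "meta",
        "source", "track", "wbr"}  # elements that never get an end tag


class NestedItemParser(HTMLParser):
    """Collects items as dicts; a nested item becomes a property value."""

    def __init__(self):
        super().__init__()
        self.top_items = []
        self._items = []     # items whose scope is currently open
        self._elements = []  # (tag, opened_item, text_prop) per open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        opened_item, text_prop = False, None
        if "itemscope" in attrs:
            item = {"type": attrs.get("itemtype"), "properties": {}}
            if name and self._items:
                # itemscope + itemprop: the new item is the parent's value.
                self._items[-1]["properties"][name] = item
            else:
                self.top_items.append(item)
            self._items.append(item)
            opened_item = True
        elif name and self._items:
            if tag in URL_ATTRS:
                self._items[-1]["properties"][name] = attrs.get(URL_ATTRS[tag], "")
            else:
                text_prop = (name, self._items[-1], [])
        if tag not in VOID:
            self._elements.append((tag, opened_item, text_prop))

    def handle_data(self, data):
        for _, _, text_prop in self._elements:
            if text_prop:
                text_prop[2].append(data)

    def handle_endtag(self, tag):
        if tag in VOID or not self._elements:
            return
        # Assumes well-formed markup: the end tag closes the innermost element.
        _, opened_item, text_prop = self._elements.pop()
        if text_prop:
            name, item, chunks = text_prop
            item["properties"][name] = "".join(chunks).strip()
        if opened_item:
            self._items.pop()


def extract_items(html):
    """Return the list of top-level items found in html."""
    parser = NestedItemParser()
    parser.feed(html)
    parser.close()
    return parser.top_items


person = extract_items(MARKUP)[0]
print(person["type"])
for prop, value in person["properties"].items():
    print(prop, "=", value)
```

The address property of the Person item holds the Address item itself, which corresponds to the address = Item(1) line in the parsed output.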
Support
Google can use microdata in its result pages.
Currently, no stable release of a browser supports the Microdata DOM API, but the upcoming Opera 12 does.
MicrodataJS is a JavaScript library and jQuery plugin that emulates the DOM API.
External links
- , about how some of the design decisions for microdata were made
- Live Microdata, a tool to interactively edit and extract the Microdata embedded in HTML