XHTML
Encyclopedia
XHTML is a family of XML markup languages that mirror or extend versions of the widely used Hypertext Markup Language (HTML), the language in which web pages are written.
While HTML (prior to HTML5) was defined as an application of Standard Generalized Markup Language (SGML), a very flexible markup language framework, XHTML is an application of XML, a more restrictive subset of SGML. Because XHTML documents need to be well-formed, they can be parsed using standard XML parsers, unlike HTML, which requires a lenient HTML-specific parser.
XHTML 1.0 became a World Wide Web Consortium (W3C) Recommendation on January 26, 2000. XHTML 1.1 became a W3C Recommendation on May 31, 2001. XHTML5 is undergoing development as of September 2009, as part of the HTML5 specification.
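The point about standard XML parsers can be illustrated with a minimal sketch. It assumes Python's standard-library `xml.etree.ElementTree` as the generic XML parser; the sample document and the helper name `page_title` are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"  # the XHTML namespace URI

# A small, hypothetical XHTML document used only for illustration.
DOCUMENT = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example page</title></head>
  <body><p>First line<br />second line</p></body>
</html>"""


def page_title(source: str) -> str:
    """Parse XHTML with a generic XML parser and return the <title> text."""
    root = ET.fromstring(source)
    # Element names are qualified with the XHTML namespace: {uri}localname.
    title = root.find(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title")
    return title.text if title is not None else ""


print(page_title(DOCUMENT))
```

No HTML-specific logic is involved here: the parser knows nothing about XHTML beyond the rules of XML, which is only possible because the document is well-formed.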
Overview
XHTML 1.0 is "a reformulation of the three HTML 4 document types as applications of XML 1.0". The World Wide Web Consortium (W3C) also continues to maintain the HTML 4.01 Recommendation, and the specifications for HTML5 and XHTML5 are being actively developed. In the current XHTML 1.0 Recommendation document, as published and revised to August 2002, the W3C commented that, "The XHTML family is the next step in the evolution of the Internet. By migrating to XHTML today, content developers can enter the XML world with all of its attendant benefits, while still remaining confident in their content's backward and future compatibility."
However, in 2004, the Web Hypertext Application Technology Working Group (WHATWG) formed, independently of the W3C, to work on advancing ordinary HTML not based on XHTML. The WHATWG eventually began working on a standard that supported both XML and non-XML serializations, HTML5, in parallel to W3C standards such as XHTML 2. In 2007, the W3C's HTML working group voted to officially recognize HTML5 and work on it as the next-generation HTML standard. In 2009, the W3C allowed the XHTML 2 Working Group's charter to expire, acknowledging that HTML5 would be the sole next-generation HTML standard, including both XML and non-XML serializations. Of the two serializations, the W3C suggests that most authors use the HTML syntax, rather than the XHTML syntax.
Motivation
XHTML was developed to make HTML more extensible and increase interoperability with other data formats. HTML 4 was ostensibly an application of Standard Generalized Markup Language (SGML); however, the specification for SGML was complex, and neither web browsers nor the HTML 4 Recommendation were fully conformant to it. The XML standard, approved in 1998, provided a simpler data format closer in simplicity to HTML 4. By shifting to an XML format, it was hoped HTML would become compatible with common XML tools; servers and proxies would be able to transform content, as necessary, for constrained devices such as mobile phones.
By utilizing namespaces, XHTML documents could provide extensibility by including fragments from other XML-based languages such as Scalable Vector Graphics and MathML. Finally, the renewed work would provide an opportunity to divide HTML into reusable components (XHTML Modularization) and clean up untidy parts of the language.
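As a rough illustration of the namespace-based extensibility described above, the following sketch embeds an SVG fragment in an XHTML document and locates elements from each vocabulary by namespace. It assumes Python's standard-library `xml.etree.ElementTree`; the document content and the prefix names in `NAMESPACES` are hypothetical choices made for this example.

```python
import xml.etree.ElementTree as ET

# Prefixes are local to this script; only the namespace URIs matter.
NAMESPACES = {
    "xhtml": "http://www.w3.org/1999/xhtml",
    "svg": "http://www.w3.org/2000/svg",
}

# Hypothetical XHTML page that includes an SVG fragment.
MIXED_DOCUMENT = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Circle</title></head>
  <body>
    <p>An inline drawing:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="40" height="40">
      <circle cx="20" cy="20" r="15" />
    </svg>
  </body>
</html>"""

root = ET.fromstring(MIXED_DOCUMENT)

# Elements from each vocabulary are selected by namespace, not by prefix.
circles = root.findall(".//svg:circle", NAMESPACES)
paragraphs = root.findall(".//xhtml:p", NAMESPACES)

print(len(circles), circles[0].get("r"))
print(paragraphs[0].text)
```

The same generic tooling handles both vocabularies because each element carries its namespace, which is the kind of interoperability with XML tools that the move to XML was hoped to provide.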
Relationship to HTML
There are various differences between XHTML and HTML. The Document Object Model (DOM) is a tree structure that represents the page internally in applications, and XHTML and HTML are two different ways of representing that in markup (serializations). Both are less expressive than the DOM (for example, "--" may be placed in comments in the DOM, but cannot be represented in a comment in either XHTML or HTML), and generally XHTML's XML syntax is a little more expressive than HTML (for example, arbitrary namespaces are not allowed in HTML). First, one source of differences is immediate: XHTML uses an XML syntax, while HTML uses a pseudo-SGML syntax (officially SGML for HTML 4 and under, but never in practice, and standardised away from SGML in HTML5). Second, because the DOM contents that each syntax can express are slightly different, there are some changes in actual behavior between the two models.
Firstly then, syntax differences:
- Broadly, the XML rules require that all elements be closed, either by a separate closing tag or using self-closing syntax (e.g. <br />), while HTML syntax permits some elements to be unclosed because either they are always empty (e.g. <input>) or their end can be determined implicitly ("omissibility", e.g. <p>).
- XML is case-sensitive for element and attribute names, while HTML is not.
- Some shorthand features in HTML are omitted in XML, such as (1) attribute minimization, where attribute values or their quotes may be omitted (e.g. <option selected> or <option selected=selected>, while in XML this must be expressed as <option selected="selected">); (2) element minimization, which may be used to remove elements entirely (such as <tbody>, which is inferred in a table if not given); and (3) the rarely used SGML syntax for element minimization ("shorttag"), which most browsers do not implement.
- There are numerous other technical requirements surrounding namespaces and precise parsing of whitespace and certain characters and elements. The exact parsing of HTML in practice has been undefined until recently; see the HTML5 specification ([HTML5]) for full details, or the working summary (HTML vs. XHTML).
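The syntax rules listed above can be checked mechanically. The sketch below assumes Python's standard-library `xml.etree.ElementTree` as a stand-in for any conforming XML parser; the helper `is_well_formed` and the sample fragments are hypothetical and cover only a few of the listed cases.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if an XML parser accepts the fragment, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML-style shorthand next to its XML-conforming equivalent.
SAMPLES = {
    "unclosed empty element": "<p>line one<br>line two</p>",
    "self-closed empty element": "<p>line one<br />line two</p>",
    "minimized attribute": "<option selected>A</option>",
    "unquoted attribute value": "<option selected=selected>A</option>",
    "quoted attribute value": '<option selected="selected">A</option>',
    "mismatched case": "<P>text</p>",
}

for label, fragment in SAMPLES.items():
    print(f"{label}: {is_well_formed(fragment)}")
```

Only the self-closed element and the quoted attribute value pass; every fragment that relies on HTML shorthand or inconsistent case is rejected by the XML parser.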
Secondly, in contrast to these minor syntactical differences, there are some behavioral differences, mostly arising from the underlying differences in serialization. For example:
- Most prominently, behavior on parse errors differs. A fatal parse error in XML (such as an incorrect tag structure) causes document processing to be aborted.
- Most content requiring namespaces will not work in HTML, except the built-in support for SVG and MathML in the HTML5 parser along with certain magic prefixes such as xlink.
- JavaScript processing is a little different in XHTML, with minor changes in case sensitivity to some functions, and further precautions to restrict processing to well-formed content. Scripts must not use the document.write method; it is not available for XHTML. The innerHTML property is available, but will not insert non-well-formed content. On the other hand, it can be used to insert well-formed namespaced content into XHTML.
- CSS is also applied slightly differently. Due to XHTML's case-sensitivity, all CSS selectors become case-sensitive for XHTML documents. Some CSS properties, such as backgrounds, set on the <body> element in HTML are 'inherited upwards' into the <html> element; this appears not to be the case for XHTML.
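The difference in parse-error behavior can be sketched by feeding the same mis-nested markup to a lenient HTML tokenizer and to an XML parser. This is only an approximation: Python's standard-library `html.parser.HTMLParser` is a tolerant tokenizer and does not implement the HTML5 error-recovery algorithm used by browsers, and the `TagCollector` class and the `BROKEN` sample are hypothetical names introduced for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Mis-nested markup: </b> closes while <i> is still open.
BROKEN = "<p>Some <b>bold <i>text</b> here</p>"


class TagCollector(HTMLParser):
    """Record every start tag the lenient HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


# The HTML tokenizer keeps going despite the bad nesting.
collector = TagCollector()
collector.feed(BROKEN)
collector.close()
print("HTML tokenizer saw:", collector.start_tags)

# The XML parser treats the same input as a fatal error and stops.
try:
    ET.fromstring(BROKEN)
    xml_result = "parsed"
except ET.ParseError as error:
    xml_result = f"aborted: {error}"
print("XML parser:", xml_result)
```

The HTML side reports all three start tags, while the XML side produces no document at all, which mirrors the abort-on-error behavior described in the first bullet.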
Adoption
The similarities between HTML 4.01 and XHTML 1.0 led many web sites and content management systems to adopt the initial W3C XHTML 1.0 Recommendation. To aid authors in the transition, the W3C provided guidance on how to publish XHTML 1.0 documents in an HTML-compatible manner, and serve them to browsers that were not designed for XHTML.
Such "HTML-compatible" content is sent using the HTML media type (text/html) rather than the official Internet media type for XHTML (application/xhtml+xml). When comparing the adoption of XHTML to that of regular HTML, therefore, it is important to distinguish whether it is media type usage or actual document contents that is being compared.
Most web browsers have mature support for all of the possible XHTML media types. The notable exception is Internet Explorer versions 8 and earlier by Microsoft; rather than rendering application/xhtml+xml content, a dialog box invites the user to save the content to disk instead. Both Internet Explorer 7 (released in 2006) and Internet Explorer 8 (released in March 2009) exhibit this behavior. Microsoft developer Chris Wilson explained in 2005 that IE7's priorities were improved security and CSS support, and that proper XHTML support would be difficult to graft onto IE's compatibility-oriented HTML parser; however, Microsoft added support for true XHTML in IE9.
As long as support is not widespread, most web developers avoid using XHTML that is not HTML-compatible, so advantages of XML such as namespaces, faster parsing and smaller-footprint browsers do not benefit the user.
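One common way to cope with uneven browser support is to choose the media type per request. The following is a simplified sketch, not a complete implementation of HTTP content negotiation: the function name `choose_media_type` is hypothetical, the Accept header strings in the usage lines are illustrative, and the approach only makes sense for HTML-compatible XHTML that is acceptable under either media type.

```python
XHTML_TYPE = "application/xhtml+xml"
HTML_TYPE = "text/html"


def choose_media_type(accept_header: str) -> str:
    """Return the XHTML media type only when the client lists it explicitly."""
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        quality = 1.0  # default quality when no q parameter is given
        for parameter in parts[1:]:
            name, _, value = parameter.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unparsable q value as a refusal
        if media_type == XHTML_TYPE and quality > 0:
            return XHTML_TYPE
    # Fall back to text/html for clients that do not advertise XHTML support.
    return HTML_TYPE


# Illustrative Accept headers, not captured from real browsers.
print(choose_media_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
print(choose_media_type("image/gif, image/jpeg, */*"))
```

The sketch deliberately ignores wildcard entries such as `*/*`, so a client that never names `application/xhtml+xml` is always served `text/html`.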
Criticism
In the early 2000s, some web developers began to question why Web authors ever made the leap into authoring in XHTML. Others countered that the problems ascribed to the use of XHTML could mostly be attributed to two main sources: the production of invalid XHTML documents by some Web authors and the lack of support for XHTML built into Internet Explorer 6. They went on to describe the benefits of XML-based Web documents (i.e. XHTML) regarding searching, indexing and parsing as well as future-proofing the Web itself.
In October 2006, HTML inventor and W3C chair Tim Berners-Lee, introducing a major W3C effort to develop a new HTML specification, posted in his blog that, "The attempt to get the world to switch to XML … all at once didn't work. The large HTML-generating public did not move … Some large communities did shift and are enjoying the fruits of well-formed systems … The plan is to charter a completely new HTML group." The current HTML5 working draft says "special attention has been given to defining clear conformance criteria for user agents in an effort to improve interoperability … while at the same time updating the HTML specifications to address issues raised in the past few years." Ian Hickson, editor of the HTML5 specification, who criticised the improper use of XHTML in 2002, is a member of the group developing this specification and is listed as one of the co-editors of the current working draft.
Simon Pieters researched the XML-compliance of mobile browsers and concluded “the claim that XHTML would be needed for mobile devices is simply a myth”.
XHTML 1.0
December 1998 saw the publication of a W3C Working Draft entitled Reformulating HTML in XML. This introduced Voyager, the codename for a new markup language based on HTML 4, but adhering to the stricter syntax rules of XML. By February 1999 the name of the specification had changed to XHTML 1.0: The Extensible HyperText Markup Language, and in January 2000 it was officially adopted as a W3C Recommendation. There are three formal DTDs (Document Type Definitions) for XHTML 1.0, corresponding to the three different versions of HTML 4.01:
- XHTML 1.0 Strict is the XML equivalent to strict HTML 4.01, and includes elements and attributes that have not been marked deprecated in the HTML 4.01 specification. As of May 25, 2011, XHTML 1.0 Strict is the document type used for the homepage of the website of the World Wide Web Consortium.
- XHTML 1.0 Transitional is the XML equivalent of HTML 4.01 Transitional, and includes the presentational elements (such as center, font and strike) excluded from the strict version.
- XHTML 1.0 Frameset is the XML equivalent of HTML 4.01 Frameset, and allows for the definition of frameset documents, a common Web feature in the late 1990s.
The second edition of XHTML 1.0 became a W3C Recommendation in August 2002.
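Each of the three DTDs listed above is selected by a document type declaration at the top of the document. The sketch below builds those declarations from a lookup table; the public and system identifiers are given as they appear in the W3C XHTML 1.0 Recommendation to the best of this editor's knowledge and should be checked against the specification before use, and the names `XHTML10_DOCTYPES` and `doctype_for` are hypothetical.

```python
# Public identifier and system identifier for each XHTML 1.0 DTD.
# Assumption: values recalled from the XHTML 1.0 Recommendation; verify before use.
XHTML10_DOCTYPES = {
    "strict": (
        "-//W3C//DTD XHTML 1.0 Strict//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    ),
    "transitional": (
        "-//W3C//DTD XHTML 1.0 Transitional//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd",
    ),
    "frameset": (
        "-//W3C//DTD XHTML 1.0 Frameset//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd",
    ),
}


def doctype_for(variant: str) -> str:
    """Build the document type declaration for one XHTML 1.0 variant."""
    public_id, system_id = XHTML10_DOCTYPES[variant]
    return f'<!DOCTYPE html PUBLIC "{public_id}" "{system_id}">'


for name in XHTML10_DOCTYPES:
    print(doctype_for(name))
```

A lookup with an unknown variant name raises `KeyError`, which is intentional here: there are only three formal DTDs for XHTML 1.0.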
Modularization of XHTML
Modularization provides an abstract collection of components through which XHTML can be subsetted and extended. The feature is intended to help XHTML extend its reach onto emerging platforms, such as mobile devices and Web-enabled televisions. The initial draft of Modularization of XHTML became available in April 1999, and reached Recommendation status in April 2001.
The first modular XHTML variants were XHTML 1.1 and XHTML Basic 1.0.
In October 2008 Modularization of XHTML was superseded by XHTML Modularization 1.1, which adds an XML Schema implementation. It was itself superseded by a second edition in July 2010.
XHTML 1.1: Module-based XHTML
XHTML 1.1 evolved out of the work surrounding the initial Modularization of XHTML specification. The W3C released a first draft in September 1999; Recommendation status was reached in May 2001. The modules combined within XHTML 1.1 effectively recreate XHTML 1.0 Strict, with the addition of ruby annotation elements (ruby, rbc, rtc, rb, rt and rp) to better support East-Asian languages. Other changes include removal of the name attribute from the a and map elements, and (in the first edition of the language) removal of the lang attribute in favour of xml:lang.
Although XHTML 1.1 is largely compatible with XHTML 1.0 and HTML 4, in August 2002 the Working Group issued a formal Note advising that it should not be transmitted with the HTML media type. With limited browser support for the alternate application/xhtml+xml media type, XHTML 1.1 proved unable to gain widespread use. In January 2009 a second edition of the document (XHTML Media Types - Second Edition) was issued, relaxing this restriction and allowing XHTML 1.1 to be served as text/html.
A second edition of XHTML 1.1 was issued on 23 November 2010, which addresses various errata and adds an XML Schema implementation not included in the original specification. (It was first released briefly on 7 May 2009 as a "Proposed Edited Recommendation" before being rescinded on 19 May due to unresolved issues.)
XHTML Basic
Since information appliances may lack the system resources to implement all XHTML abstract modules, the W3C defined a feature-limited XHTML specification called XHTML Basic. It provides a minimal feature subset sufficient for the most common content-authoring. The specification became a W3C Recommendation in December 2000.
Of all the versions of XHTML, XHTML Basic 1.0 provides the fewest features. With XHTML 1.1, it is one of the two first implementations of modular XHTML. In addition to the Core Modules (Structure, Text, Hypertext, and List), it implements the following abstract modules: Base, Basic Forms, Basic Tables, Image, Link, Metainformation, Object, Style Sheet, and Target.
XHTML Basic 1.1 replaces the Basic Forms Module with the Forms Module, and adds the Intrinsic Events, Presentation, and Scripting modules. It also supports additional tags and attributes from other modules. This version became a W3C recommendation on 29 July 2008.
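As a rough illustration of how the two versions differ in composition, the module lists above can be modelled as sets. This is only a sketch: the module names come from the text, while the variable names and the set-based model are assumptions made for this example and are not part of any W3C specification.

```python
# Module names are taken from the text above; modelling them as Python sets
# is an illustrative assumption, not a W3C data format.
CORE_MODULES = {"Structure", "Text", "Hypertext", "List"}

XHTML_BASIC_1_0 = CORE_MODULES | {
    "Base", "Basic Forms", "Basic Tables", "Image", "Link",
    "Metainformation", "Object", "Style Sheet", "Target",
}

# XHTML Basic 1.1 swaps Basic Forms for Forms and adds three further modules.
XHTML_BASIC_1_1 = (XHTML_BASIC_1_0 - {"Basic Forms"}) | {
    "Forms", "Intrinsic Events", "Presentation", "Scripting",
}

added = sorted(XHTML_BASIC_1_1 - XHTML_BASIC_1_0)
removed = sorted(XHTML_BASIC_1_0 - XHTML_BASIC_1_1)
print("added:", added)
print("removed:", removed)
```

The model deliberately ignores the "additional tags and attributes from other modules" that version 1.1 also supports, since the text does not enumerate them.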
The current version of XHTML Basic is 1.1 Second Edition (23 November 2010), in which the language is re-implemented in the W3C's XML Schema language. This version also supports the lang attribute.
XHTML-Print
XHTML-Print, which became a W3C Recommendation in September 2006, is a specialized version of XHTML Basic designed for documents printed from information appliances to low-end printers.
XHTML Mobile Profile
XHTML Mobile Profile (abbreviated XHTML MP or XHTML-MP) is a third-party variant of the W3C's XHTML Basic specification. Like XHTML Basic, XHTML MP was developed for information appliances with limited system resources.
In October 2001, a limited company called the Wireless Application Protocol Forum began adapting XHTML Basic for WAP 2.0, the second major version of the Wireless Application Protocol. The WAP Forum based its DTD on the W3C's Modularization of XHTML, incorporating the same modules the W3C used in XHTML Basic 1.0, except for the Target Module. Starting with this foundation, the WAP Forum replaced the Basic Forms Module with a partial implementation of the Forms Module, added partial support for the Legacy and Presentation modules, and added full support for the Style Attribute Module.
In 2002, the WAP Forum was subsumed into the Open Mobile Alliance (OMA), which continued to develop XHTML Mobile Profile as a component of their OMA Browsing Specification.
XHTML Mobile Profile 1.1
To this version, finalized in 2004, the OMA added partial support for the Scripting Module, and partial support for Intrinsic Events. XHTML MP 1.1 is part of v2.1 of the OMA Browsing Specification (1 November 2002).
XHTML Mobile Profile 1.2
This version, finalized 27 February 2007, expands the capabilities of XHTML MP 1.1 with full support for the Forms Module and OMA Text Input Modes. XHTML MP 1.2 is part of v2.3 of the OMA Browsing Specification (13 March 2007).
XHTML Mobile Profile 1.3
XHTML MP 1.3 (finalized on 23 September 2008) uses the XHTML Basic 1.1 document type definition, which includes the Target Module. Events in this version of the specification are updated to DOM Level 3 specifications (i.e., they are platform- and language-neutral).
XHTML 1.2
The XHTML 2 Working Group considered the creation of a new language based on XHTML 1.1. Had XHTML 1.2 been created, it would have included WAI-ARIA and role attributes to better support accessible web applications, and improved Semantic Web support through RDFa. The inputmode attribute from XHTML Basic 1.1, along with the target attribute (for specifying frame targets), might also have been present. The XHTML2 WG had not been chartered to carry out the development of XHTML 1.2. Since the W3C announced that it did not intend to recharter the XHTML2 WG, and closed the WG in December 2010, the XHTML 1.2 proposal did not eventuate.
XHTML 2.0
Between August 2002 and July 2006 the W3C released eight Working Drafts of XHTML 2.0, a new version of XHTML able to make a clean break from the past by discarding the requirement of backward compatibility. This lack of compatibility with XHTML 1.x and HTML 4 caused some early controversy in the web developer community. Some parts of the language (such as the role and RDFa attributes) were subsequently split out of the specification and worked on as separate modules, partially to help make the transition from XHTML 1.x to XHTML 2.0 smoother. A ninth draft of XHTML 2.0 was expected to appear in 2009, but on July 2, 2009, the W3C decided to let the XHTML2 Working Group charter expire by that year's end, effectively halting any further development of the draft into a standard. Instead, XHTML 2.0 and its related documents were released as W3C Notes.
New features that XHTML 2.0 would have introduced include:
- HTML forms would be replaced by XForms, an XML-based user input specification allowing forms to be displayed appropriately for different rendering devices.
- HTML frames would be replaced by XFrames.
- The DOM Events would be replaced by XML Events, which uses the XML Document Object Model.
- A new list element type, the nl element type, would be included to specifically designate a list as a navigation list. This would be useful in creating nested menus, which are currently created by a wide variety of means like nested unordered lists or nested definition lists.
- Any element would be able to act as a hyperlink, e.g., <li href="articles.html">Articles</li>, similar to XLink. However, XLink itself is not compatible with XHTML due to design differences.
- Any element would be able to reference alternative media with the src attribute, e.g., <p src="lbridge.jpg">London Bridge</p> is the same as <object src="lbridge.jpg"><p>London Bridge</p></object>.
- The alt attribute of the img element would be removed: alternative text would be given in the content of the img element, much like the object element, e.g., <img src="hms_audacious.jpg">HMS Audacious</img>.
- A single heading element (h) would be added. The level of each heading is determined by the depth of its nesting. This allows an unlimited number of heading levels, rather than limiting use to six levels deep.
- The remaining presentational elements i, b and tt, still allowed in XHTML 1.x (even Strict), would be absent from XHTML 2.0. The only somewhat presentational elements remaining would be sup and sub for superscript and subscript respectively, because they have significant non-presentational uses and are required by certain languages. All other tags are meant to be semantic instead (e.g., <strong> for strong or bolded text) while allowing the user agent to control the presentation of elements via CSS.
- RDF triples could be expressed with the property and about attributes, to facilitate the conversion from XHTML to RDF/XML.
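The nesting-based heading model in the list above can be sketched in a few lines of Python. This is a hedged illustration only: it assumes that heading depth is counted by enclosing section elements, as in the XHTML 2.0 drafts, and the sample markup and the function name heading_levels are invented for the example (the namespace is omitted for brevity).

```python
import xml.etree.ElementTree as ET

# Illustrative XHTML 2.0-style fragment; not taken from any specification.
DOC = """
<body>
  <h>Top</h>
  <section>
    <h>Chapter</h>
    <section>
      <h>Subsection</h>
    </section>
  </section>
</body>
"""


def heading_levels(element, depth=1):
    """Yield (level, text) for each h element; level grows with section nesting."""
    for child in element:
        if child.tag == "h":
            yield depth, child.text
        else:
            # Only a section element deepens the heading level (assumption).
            next_depth = depth + 1 if child.tag == "section" else depth
            yield from heading_levels(child, next_depth)


levels = list(heading_levels(ET.fromstring(DOC)))
print(levels)
```

Because the level is derived from the tree rather than from the element name, nothing in this approach limits headings to six levels.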
XHTML5
HTML5 initially grew independently of the W3C, through a loose group of browser manufacturers and other interested parties calling themselves the WHATWG, or Web Hypertext Application Technology Working Group. The WHATWG announced the existence of an open mailing list in June 2004, along with a website bearing the strapline “Maintaining and evolving HTML since 2004.” The key motive of the group was to create a platform for dynamic web applications; they considered XHTML 2.0 to be too document-centric, and not suitable for the creation of internet forum sites or online shops.
In April 2007, the Mozilla Foundation and Opera Software joined Apple in requesting that the newly rechartered HTML Working Group of the W3C adopt the work, under the name of HTML 5. The group resolved to do this the following month, and the First Public Working Draft of HTML5 was issued by the W3C in January 2008. The most recent W3C Working Draft was published in January 2011.
HTML5 has both a regular text/html serialization and an XML serialization, which is known as XHTML5. In addition to the markup language, the specification includes a number of application programming interfaces. The Document Object Model is extended with APIs for editing, drag-and-drop, data storage and network communication.
The language is more compatible with HTML 4 and XHTML 1.x than XHTML 2.0 is, due to the decision to keep the existing HTML form elements and events model. It adds many new elements not found in XHTML 1.x, however, such as section and aside.
The most recent draft includes WAI-ARIA support.
Semantic content in XHTML
XHTML+RDFa is an extended version of the XHTML markup language for supporting RDF through a collection of attributes and processing rules in the form of well-formed XML documents. This host language is one of the techniques used to develop Semantic Web content by embedding rich semantic markup.
Valid XHTML documents
An XHTML document that conforms to an XHTML specification is said to be valid. Validity assures consistency in document code, which in turn eases processing, but does not necessarily ensure consistent rendering by browsers. A document can be checked for validity with the W3C Markup Validation Service. In practice, many web development programs provide code validation based on the W3C standards.
Root element
The root element of an XHTML document must be html, and must contain an xmlns attribute to associate it with the XHTML namespace. The namespace URI for XHTML is http://www.w3.org/1999/xhtml. The example tag below additionally features an xml:lang attribute to identify the document with a natural language:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
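The root element requirements described above can be checked programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name check_root and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace bound to the xml: prefix


def check_root(document: str) -> dict:
    """Report whether the root is html in the XHTML namespace, and its xml:lang."""
    root = ET.fromstring(document)
    return {
        # ElementTree spells qualified names as "{namespace-uri}local-name".
        "is_xhtml_root": root.tag == f"{{{XHTML_NS}}}html",
        "language": root.get(f"{{{XML_NS}}}lang"),
    }


SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">'
    "<head><title>Sample</title></head><body><p>Hello</p></body></html>"
)
result = check_root(SAMPLE)
print(result)
```

A document whose html element lacks the xmlns attribute still parses as XML, but its root is then reported as a plain html element outside the XHTML namespace.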
DOCTYPEs
In order to validate an XHTML document, a Document Type Declaration, or DOCTYPE, may be used. A DOCTYPE declares to the browser the Document Type Definition (DTD) to which the document conforms. A Document Type Declaration should be placed before the root element. Common XHTML Document Type Declarations include:
- XHTML 1.0 Strict: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- XHTML 1.0 Transitional: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
- XHTML 1.0 Frameset: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
- XHTML 1.1: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
The system identifier part of the DOCTYPE, which in these examples is the URL that begins with http://, need only point to a copy of the DTD to use, if the validator cannot locate one based on the public identifier (the other quoted string). It does not need to be the specific URL that is in these examples; in fact, authors are encouraged to use local copies of the DTD files when possible. The public identifier, however, must be character-for-character the same as in the examples.
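The split between the public and system identifiers can be made concrete with a short Python sketch. It uses a simple regular expression rather than a full DTD-aware parser, so it only handles declarations of the PUBLIC form shown here; the helper names parse_doctype and with_local_dtd and the local path are hypothetical.

```python
import re

# Matches only the <!DOCTYPE root PUBLIC "public-id" "system-id"> form.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<root>\S+)\s+PUBLIC\s+'
    r'"(?P<public>[^"]*)"\s+"(?P<system>[^"]*)"\s*>'
)


def parse_doctype(declaration: str):
    """Return the root name, public identifier and system identifier, or None."""
    match = DOCTYPE_RE.search(declaration)
    return match.groupdict() if match else None


def with_local_dtd(declaration: str, local_path: str) -> str:
    """Point the system identifier at a local DTD copy; the public identifier is kept verbatim."""
    parts = parse_doctype(declaration)
    return f'<!DOCTYPE {parts["root"]} PUBLIC "{parts["public"]}" "{local_path}">'


STRICT = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
)
parts = parse_doctype(STRICT)
local = with_local_dtd(STRICT, "DTD/xhtml1-strict.dtd")  # hypothetical local path
print(parts["public"])
print(local)
```

Only the system identifier changes in the rewritten declaration, which mirrors the rule that the public identifier must stay character-for-character the same.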
XML declaration
A character encoding may be specified at the beginning of an XHTML document in the XML declaration when the document is served using the application/xhtml+xml MIME type. (If an XML document lacks encoding specification, an XML parser assumes that the encoding is UTF-8 or UTF-16, unless the encoding has already been determined by a higher protocol.)
For example:
<?xml version="1.0" encoding="UTF-8" ?>
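How an XML parser treats declared and undeclared encodings can be observed with Python's standard xml.etree.ElementTree module. This is a small sketch, not a description of any browser's behaviour; the sample text and the helper parse_text are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

TEXT = "<p>caf\u00e9</p>"  # sample content with one non-ASCII character


def parse_text(data: bytes):
    """Return the root element's text, or None if the bytes are not well-formed XML."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None


# No declaration: the parser assumes UTF-8 here (there is no byte order mark).
from_utf8 = parse_text(TEXT.encode("utf-8"))

# A different encoding has to be announced in the XML declaration.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + TEXT.encode("iso-8859-1")
from_latin1 = parse_text(declared)

# The same ISO-8859-1 bytes without a declaration are read as UTF-8 and rejected.
undeclared = parse_text(TEXT.encode("iso-8859-1"))

print(from_utf8, from_latin1, undeclared)
```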
In this example the declaration may be omitted, because the encoding it declares is the default encoding. However, if the document instead makes use of XML 1.1 or another character encoding, a declaration is necessary. Internet Explorer prior to version 7 enters quirks mode if it encounters an XML declaration in a document served as text/html.
Common errors
Some of the most common errors in the usage of XHTML are:
- Not closing empty elements (elements without closing tags in HTML4)
  - Incorrect: <br>
  - Correct: <br />
  Note that any of these is acceptable in XHTML: <br></br>, <br/>, and <br />. Older HTML-only browsers interpreting it as HTML will generally accept <br> and <br />.
- Not closing non-empty elements
  - Incorrect: <p>This is a paragraph.<p>This is another paragraph.
  - Correct: <p>This is a paragraph.</p><p>This is another paragraph.</p>
- Improperly nesting elements (note that this would also be invalid in HTML)
  - Incorrect: <em><strong>This is some text.</em></strong>
  - Correct: <em><strong>This is some text.</strong></em>
- Not putting quotation marks around attribute values
  - Incorrect: <td rowspan=3>
  - Correct: <td rowspan="3">
- Using the ampersand character outside of entities (note that this would also be invalid in HTML)
- Failing to recognize that XHTML elements and attributes are case sensitive
  - Incorrect: <BODY><P>The Best Page Ever</P></BODY>
  - Correct: <body><p>The Best Page Ever</p></body>
- Using attribute minimization
  - Incorrect: <textarea readonly>READ-ONLY</textarea>
  - Correct: <textarea readonly="readonly">READ-ONLY</textarea>
- Misusing CDATA, script-comments and xml-comments when embedding scripts and stylesheets
  - This problem can be avoided altogether by putting all script and stylesheet information into separate files and referring to them with link and script elements in the XHTML head element.
  - Note: The format <script …></script>, rather than the more concise <script … />, is required for HTML compatibility when served as MIME type text/html.
  - If an author chooses to include script or style data inline within an XHTML document, different approaches are recommended depending on whether the author intends to serve the page as application/xhtml+xml and target only fully conformant browsers, or serve the page as text/html and try to obtain usability in Internet Explorer 6 and other non-conformant browsers.
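Most of the errors listed above are well-formedness errors, so any standard XML parser rejects them outright. The sketch below, using Python's xml.etree.ElementTree, runs an incorrect and a corrected fragment for several of them. The fragments and the helper is_well_formed are illustrative assumptions, and the check covers well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Each entry pairs an incorrect fragment with its corrected form.
CASES = {
    "unclosed empty element": ("<p>line<br></p>", "<p>line<br /></p>"),
    "unclosed paragraph": ("<div><p>One.<p>Two.</div>",
                           "<div><p>One.</p><p>Two.</p></div>"),
    "improper nesting": ("<p><em><strong>text</em></strong></p>",
                         "<p><em><strong>text</strong></em></p>"),
    "unquoted attribute": ("<td rowspan=3>x</td>", '<td rowspan="3">x</td>'),
    "bare ampersand": ("<title>Cars & Trucks</title>",
                       "<title>Cars &amp; Trucks</title>"),
    "minimized attribute": ("<textarea readonly>x</textarea>",
                            '<textarea readonly="readonly">x</textarea>'),
    "unescaped inline script": ("<script>if (a < b) {}</script>",
                                "<script><![CDATA[if (a < b) {}]]></script>"),
}

results = {name: (is_well_formed(bad), is_well_formed(good))
           for name, (bad, good) in CASES.items()}
for name, (bad_ok, good_ok) in results.items():
    print(f"{name}: incorrect parses={bad_ok}, correct parses={good_ok}")

# Upper-case tags are well-formed XML; only validation against the DTD catches them.
uppercase_parses = is_well_formed("<BODY><P>The Best Page Ever</P></BODY>")
```

The case-sensitivity error is the exception: a consistently upper-case document still parses, which is why a validator is needed in addition to a parser.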
Backward compatibility
XHTML 1.x documents are mostly backward compatible with HTML 4 user agents when the appropriate guidelines are followed. XHTML 1.1 is essentially compatible, although the elements for ruby annotation are not part of the HTML 4 specification and thus generally ignored by HTML 4 browsers. Later XHTML 1.x modules such as those for the role attribute, RDFa and WAI-ARIA degrade gracefully in a similar manner.
XHTML 2.0 is significantly less compatible, although this can be mitigated to some degree through the use of scripting. (This can be simple one-liners, such as the use of “document.createElement” to register a new HTML element within Internet Explorer, or complete JavaScript frameworks, such as the FormFaces implementation of XForms.)
Examples
The following are two examples of XHTML 1.0 Strict that produce the same visual output. The first follows the HTML Compatibility Guidelines of the XHTML Media Types Note, while the second breaks backward compatibility but provides cleaner markup.
Media type recommendation for the examples:
Media type              Example 1   Example 2
application/xhtml+xml   SHOULD      SHOULD
application/xml         MAY         MAY
text/xml                MAY         MAY
text/html               MAY         SHOULD NOT
Example 1.
Example 2.
Notes:
- The "loadpdf" function is actually a workaround for Internet Explorer. It can be replaced by adding […] within […].
- The img element does not get a name attribute in the XHTML 1.0 Strict DTD. Use id instead.
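One way to act on the media type recommendations in this section is to choose the Content-Type from the request's Accept header. The following Python sketch shows the idea under simplifying assumptions: the function names parse_accept and choose_media_type are hypothetical, wildcards such as */* are deliberately not treated as XHTML support, and falling back to text/html is only appropriate for documents that follow the HTML Compatibility Guidelines (Example 1 in the table).

```python
def parse_accept(header: str) -> dict:
    """Map each media range in an Accept header to its q-value (default 1.0)."""
    preferences = {}
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as "not acceptable"
        preferences[media_range] = quality
    return preferences


def choose_media_type(accept_header: str) -> str:
    """Prefer application/xhtml+xml only when the client lists it explicitly."""
    prefs = parse_accept(accept_header)
    xhtml_q = prefs.get("application/xhtml+xml", 0.0)
    html_q = prefs.get("text/html", 0.0)
    if xhtml_q > 0 and xhtml_q >= html_q:
        return "application/xhtml+xml"
    return "text/html"


modern = choose_media_type(
    "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
legacy = choose_media_type("text/html, */*")
print(modern, legacy)
```

Requiring an explicit application/xhtml+xml entry is a conservative design choice, since some older browsers send only */* without being able to process XHTML as XML.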
Cross-compatibility of XHTML and HTML
HTML5 and XHTML5 serializations are largely inter-compatible if adhering to the stricter XHTML5 syntax, but there are some cases in which XHTML will not work as valid HTML5 (e.g., processing instructions are deprecated in HTML, are treated as comments, and close on the first ">", whereas they are fully allowed in XML, are treated as their own type, and close on "?>").
See also
- HTML
- Extensible User Interface Protocol
- List of XML and HTML character entity references
The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.