Cloaking
Cloaking is a search engine optimization (SEO) technique in which the content presented to the search engine spider is different from that presented to the user's browser. This is done by delivering content based on the IP address or the User-Agent HTTP header of the client requesting the page. When a client is identified as a search engine spider, a server-side script delivers a different version of the web page, one that contains content not present on the visible page, or that is present but not searchable. The purpose of cloaking is sometimes to deceive search engines so they display the page when it would not otherwise be displayed (black hat SEO). However, it can also be a functional (though antiquated) technique for informing search engines of content they would not otherwise be able to locate because it is embedded in non-textual containers such as video or certain Adobe Flash components.
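The selection step described above can be illustrated with a minimal sketch. It is not taken from any real site or search engine: the user-agent tokens, the address range (a reserved documentation range) and the function names are all assumptions chosen for illustration.

```python
import ipaddress

# Hypothetical values for illustration only: real spiders publish their own
# user-agent tokens and address ranges, and both change over time.
SPIDER_UA_TOKENS = ("examplebot", "samplecrawler")
SPIDER_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]  # reserved documentation range


def looks_like_spider(user_agent: str, remote_ip: str) -> bool:
    """Return True if the request matches a listed spider token or address range."""
    ua = user_agent.lower()
    if any(token in ua for token in SPIDER_UA_TOKENS):
        return True
    address = ipaddress.ip_address(remote_ip)
    return any(address in network for network in SPIDER_NETWORKS)


def choose_variant(user_agent: str, remote_ip: str) -> str:
    """Name the page version a server-side script would deliver for this request."""
    return "spider" if looks_like_spider(user_agent, remote_ip) else "visitor"
```

Because the User-Agent header is supplied by the client and is trivial to change, a check of this kind can be reproduced by anyone who requests the same URL twice with different headers and compares the responses.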
As of 2006, better methods of accessibility, including progressive enhancement, are available, so cloaking is not considered necessary by proponents of that method. Cloaking is often used as a spamdexing technique to try to trick search engines into giving the relevant site a higher ranking. It can also be used to trick search engine users into visiting a site based on the search engine description, when the site turns out to have substantially different, or even pornographic, content. For this reason, major search engines consider cloaking for deception to be a violation of their guidelines, and they delist sites when deceptive cloaking is reported.
Cloaking is a form of the doorway page technique.
A similar technique is also used on the Open Directory Project web directory. It differs in several ways from search engine cloaking:
- It is intended to fool human editors, rather than computer search engine spiders.
- The decision to cloak or not is often based upon the HTTP referrer, the user agent or the visitor's IP address. More advanced techniques can also be based upon analysis of the client's behaviour after a few page requests: the raw quantity of, the ordering of, and the latency between subsequent HTTP requests sent to a website's pages, plus the presence of a check for the robots.txt file, are some of the parameters in which search engine spiders differ heavily from natural user behaviour. The referrer gives the URL of the page on which a user clicked a link to get to the page. Some cloakers will give the fake page to anyone who comes from a web directory website, since directory editors will usually examine sites by clicking on links that appear on a directory web page. Other cloakers give the fake page to everyone except those coming from a major search engine; this makes it harder to detect cloaking, while not costing them many visitors, since most people find websites by using a search engine.
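The behavioural signals listed above (request quantity, latency between requests, and a robots.txt check) can be combined into a simple score. The sketch below is only an illustration under assumed thresholds; the `Session` type, the constants and the scoring scheme are hypothetical, and the ordering of requests is not modelled.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List

# Hypothetical thresholds chosen for illustration, not measured values.
MANY_REQUESTS = 20        # requests in one session that would be unusual for a person
SHORT_GAP_SECONDS = 1.0   # average gap below this looks automated


@dataclass
class Session:
    """Requests seen from one client; timestamps are seconds since the first hit."""
    paths: List[str]
    timestamps: List[float]


def spider_score(session: Session) -> int:
    """Count how many spider-like signals a session shows (0 to 3)."""
    score = 0
    # Signal 1: the client checked robots.txt, which browsers do not normally do.
    if "/robots.txt" in session.paths:
        score += 1
    # Signal 2: raw quantity of requests.
    if len(session.paths) >= MANY_REQUESTS:
        score += 1
    # Signal 3: latency between subsequent requests.
    gaps = [b - a for a, b in zip(session.timestamps, session.timestamps[1:])]
    if gaps and mean(gaps) < SHORT_GAP_SECONDS:
        score += 1
    return score
```

A session that fetches robots.txt and then thirty pages a fifth of a second apart would score 3, while a visitor who opens two pages 45 seconds apart would score 0.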
Black hat perspective
Increasingly, when a page lacks the natural popularity that comes from compelling or rewarding content, webmasters may be tempted to design pages solely for the search engines in order to rank well. This results in pages with too many keywords and other factors that might be search engine "friendly", but make the pages difficult for actual visitors to consume. As such, black hat SEO practitioners consider cloaking to be an important technique that allows webmasters to split their efforts and separately target the search engine spiders and human visitors.
In September 2007, Ralph Tegtmeier and Ed Purkiss coined the term "mosaic cloaking", whereby dynamic pages are constructed as tiles of content and only portions of the pages, JavaScript and CSS are changed. This simultaneously decreases the contrast between the cloaked page and the "friendly" page while increasing the capability for targeted delivery of content to various spiders and human visitors.
Cloaking versus IP delivery
IP delivery can be considered a more benign variation of cloaking, where different content is served based upon the requester's IP address. With cloaking, search engines and people never see the other's pages, whereas, with other uses of IP delivery, both search engines and people can see the same pages. This technique is sometimes used by graphics-heavy sites that have little textual content for spiders to analyze.
One use of IP delivery is to determine the requester's location and deliver content specifically written for that country. This isn't necessarily cloaking. For instance, Google uses IP delivery for its AdWords and AdSense advertising programs to target users in different geographic locations.
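Location-based IP delivery of this kind can be sketched as a lookup from address range to country to content. The ranges (reserved documentation ranges), country assignments and file names below are assumptions made up for the example; a real deployment would rely on a maintained geolocation database.

```python
import ipaddress

# Hypothetical lookup tables for illustration only.
COUNTRY_BY_NETWORK = {
    ipaddress.ip_network("198.51.100.0/24"): "DE",
    ipaddress.ip_network("203.0.113.0/24"): "JP",
}
CONTENT_BY_COUNTRY = {"DE": "offers-de.html", "JP": "offers-jp.html"}
DEFAULT_CONTENT = "offers-international.html"


def content_for_ip(remote_ip: str) -> str:
    """Pick content by the requester's apparent country, with a fallback page."""
    address = ipaddress.ip_address(remote_ip)
    for network, country in COUNTRY_BY_NETWORK.items():
        if address in network:
            return CONTENT_BY_COUNTRY.get(country, DEFAULT_CONTENT)
    return DEFAULT_CONTENT
```

The function never asks whether the requester is a spider or a person, so a spider and a human visitor arriving from the same address range receive the same page. That is the property that separates this use of IP delivery from cloaking.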
IP delivery is a crude and unreliable method of determining the language in which to provide content. Many countries and regions are multilingual, or the requester may be a foreign national. A better method of content negotiation is to examine the client's Accept-Language HTTP header.
As of 2006, many sites have taken up IP delivery to personalise content for their regular customers. Many of the top 1000 sites, including sites like Amazon (amazon.com), actively use IP delivery. None of these have been banned from search engines, as their intent is not deceptive.
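Language selection from the Accept-Language header, mentioned above as the better alternative to IP delivery, might look like the following sketch. It handles only the common `language;q=value` form, ignores the `*` wildcard, and expects the supported languages as lowercase tags; the function names are hypothetical and this is not a complete implementation of HTTP content negotiation.

```python
from typing import List, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Return (language, quality) pairs sorted from most to least preferred."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        lang, _, params = part.partition(";")
        quality = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((lang.strip().lower(), quality))
    # sorted() is stable, so entries with equal quality keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def choose_language(header: str, supported: List[str], default: str) -> str:
    """Pick the first supported language the client accepts, else the default."""
    for lang, quality in parse_accept_language(header):
        if quality <= 0:
            continue
        if lang in supported:
            return lang
        primary = lang.split("-")[0]  # fall back from "de-at" to "de"
        if primary in supported:
            return primary
    return default
```

For a site that supports English and German, a client sending `de-AT,de;q=0.9` would be served German regardless of the country its IP address maps to.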
See also
- Cloaking device
- Search engine optimization
- Spamdexing
- Doorway page
- Keyword stuffing
- Link farms
- URL redirection
- Technology: content negotiation, geo targeting