Web search engine - AbsoluteAstronomy.com

A web search engine is designed to search for information on the World Wide Web and FTP servers. The search results are generally presented as a list, often referred to as search engine results pages (SERPs). The information may consist of web pages, images, and other types of files. Some search engines also mine data available in databases or open directories. Unlike web directories, which are maintained only by human editors, search engines also maintain real-time information by running an algorithm on a web crawler.

History

Timeline (full list)
Year	Engine	Current status
1993	W3Catalog	Closed
	Aliweb	Closed
	JumpStation	Closed
1994	WebCrawler	Active, Aggregator
	Go.com	Active, Yahoo Search
	Lycos	Active
1995	AltaVista	Bought and operated by Yahoo!
	Daum	Active
	Magellan	Closed
	Excite	Active
	SAPO	Active
	Yahoo!	Active, Launched as a directory
1996	Dogpile	Active, Aggregator
	Inktomi	Acquired by Yahoo!
	HotBot	Active (lycos.com)
	Ask Jeeves	Active (ask.com, Jeeves went away)
1997	Northern Light	Closed
	Yandex	Active
1998	Google	Active
	MSN Search	Active as Bing
1999	AlltheWeb	Closed (URL redirected to Yahoo!)
	GenieKnows	Active, rebranded Yellowee.com
	Naver	Active
	Teoma	Active
	Vivisimo	Closed
2000	Baidu	Active
	Exalead	Acquired by Dassault Systèmes
2002	Inktomi	Acquired by Yahoo!
2003	Info.com	Active
2004	Yahoo! Search	Active, Launched own web search (see Yahoo! Directory, 1995)
	A9.com	Closed
	Sogou	Active
2005	AOL Search	Active
	Ask.com	Active
	GoodSearch	Active
	SearchMe	Closed
2006	Wikiseek	Active
	Quaero	Active
	Ask.com	Active
	Live Search	Active as Bing, Launched as rebranded MSN Search
	ChaCha	Active
	Guruji.com	Active
2007	Wikiseek	Closed
	Sproose	Closed
	Wikia Search	Closed
	Blackle.com	Active
2008	Powerset	Acquired by Microsoft
	Picollator	Closed
	Viewzi	Closed
	Boogami	Active
	LeapFish	Closed
	Forestle	Active
	VADLO	Active
	Duck Duck Go	Active, Aggregator
2009	Bing	Active, Launched as rebranded Live Search
	Yebol	Active
	Search2.net	Active
	Mugurdy	Closed due to a lack of funding
	Goby	Active
2010	Yandex	Active, Launched global (English) search
	Cuil	Closed
	Blekko	Active
	Yummly	Active
	Solusee	Active
2011	Interred	Active

During the early development of the web, there was a list of webservers edited by Tim Berners-Lee and hosted on the CERN webserver. One historical snapshot from 1992 remains. As more webservers went online, the central list could not keep up. On the NCSA site, new servers were announced under the title "What's New!".

The very first tool used for searching on the Internet was Archie. The name stands for "archive" without the "v". It was created in 1990 by Alan Emtage, Bill Heelan and J. Peter Deutsch, computer science students at McGill University in Montreal. The program downloaded the directory listings of all the files located on public anonymous FTP (File Transfer Protocol) sites, creating a searchable database of file names; however, Archie did not index the contents of these sites since the amount of data was so limited it could be readily searched manually.
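The idea of a searchable database of file names can be illustrated with a minimal Python sketch. It is not Archie's actual implementation: the host names, paths, and the `build_index` and `search` helpers are hypothetical, and the listings are hard-coded rather than downloaded over FTP. Only file names are indexed, never file contents.

```python
from collections import defaultdict
import posixpath

# Hypothetical directory listings, as if gathered from anonymous FTP sites.
LISTINGS = {
    "ftp.example.edu": ["/pub/gnu/emacs-18.59.tar.Z", "/pub/tex/README"],
    "ftp.example.org": ["/mirrors/gnu/emacs-18.59.tar.Z", "/pub/games/rogue.tar.Z"],
}


def build_index(listings):
    """Map each file name (not its contents) to every place it was seen."""
    index = defaultdict(list)
    for host, paths in listings.items():
        for path in paths:
            index[posixpath.basename(path)].append((host, path))
    return index


def search(index, pattern):
    """Return sorted (host, path) pairs whose file name contains the pattern."""
    pattern = pattern.lower()
    hits = []
    for name, locations in index.items():
        if pattern in name.lower():
            hits.extend(locations)
    return sorted(hits)


index = build_index(LISTINGS)
for host, path in search(index, "emacs"):
    print(host, path)
```

A query here matches against names only, so a file whose contents mention "emacs" but whose name does not would never be found.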

The rise of Gopher (created in 1991 by Mark McCahill at the University of Minnesota) led to two new search programs, Veronica and Jughead. Like Archie, they searched the file names and titles stored in Gopher index systems. Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives) provided a keyword search of most Gopher menu titles in the entire Gopher listings. Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display) was a tool for obtaining menu information from specific Gopher servers. While the name of the search engine "Archie" was not a reference to the Archie comic book series, "Veronica" and "Jughead" are characters in the series, thus referencing their predecessor.

In the summer of 1993, no search engine existed yet for the web, though numerous specialized catalogues were maintained by hand. Oscar Nierstrasz at the University of Geneva wrote a series of Perl scripts that would periodically mirror these pages and rewrite them into a standard format, which formed the basis for W3Catalog, the web's first primitive search engine, released on September 2, 1993.

In June 1993, Matthew Gray, then at MIT, produced what was probably the first web robot, the Perl-based World Wide Web Wanderer, and used it to generate an index called 'Wandex'. The purpose of the Wanderer was to measure the size of the World Wide Web, which it did until late 1995. The web's second search engine, Aliweb, appeared in November 1993. Aliweb did not use a web robot, but instead depended on being notified by website administrators of the existence at each site of an index file in a particular format.

JumpStation (released in December 1993) used a web robot to find web pages and to build its index, and used a web form as the interface to its query program. It was thus the first WWW resource-discovery tool to combine the three essential features of a web search engine (crawling, indexing, and searching) as described below. Because of the limited resources available on the platform on which it ran, its indexing and hence searching were limited to the titles and headings found in the web pages the crawler encountered.
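The combination of crawling, indexing, and searching, restricted to titles and headings, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not JumpStation's code: the `PAGES` dictionary stands in for the web (a real robot would fetch pages over HTTP), and the URLs, the `TitleHeadingParser` class, and the `crawl_and_index` and `search` functions are all hypothetical names.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML. A real crawler would fetch over HTTP.
PAGES = {
    "http://a.example/": "<title>Physics notes</title><h1>Optics</h1>"
                         "<p>lenses</p><a href='http://b.example/'>b</a>",
    "http://b.example/": "<title>Recipes</title><h2>Bread</h2>"
                         "<p>physics of dough</p><a href='http://a.example/'>a</a>",
}


class TitleHeadingParser(HTMLParser):
    """Collect outgoing links plus the words found in titles and headings."""
    INDEXED = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.links, self.words, self._depth = [], [], 0

    def handle_starttag(self, tag, attrs):
        if tag in self.INDEXED:
            self._depth += 1
        elif tag == "a":
            self.links += [v for k, v in attrs if k == "href" and v]

    def handle_endtag(self, tag):
        if tag in self.INDEXED and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:  # ignore body text outside titles and headings
            self.words += data.lower().split()


def crawl_and_index(seed):
    """Crawl breadth-first from seed, indexing title and heading words only."""
    index, seen, queue = defaultdict(set), {seed}, deque([seed])
    while queue:
        url = queue.popleft()
        html = PAGES.get(url)
        if html is None:
            continue
        parser = TitleHeadingParser()
        parser.feed(html)
        parser.close()
        for word in parser.words:
            index[word].add(url)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """Return URLs whose titles or headings contain every query word."""
    sets = [index.get(word, set()) for word in query.lower().split()]
    return sorted(set.intersection(*sets)) if sets else []


index = crawl_and_index("http://a.example/")
print(search(index, "physics"))  # body text is not indexed, so only a.example matches
```

Because only titles and headings are indexed, the word "physics" in the body of the second page is invisible to the search, which mirrors the limitation described above.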

One of the first "full text" crawler-based search engines was WebCrawler, which came out in 1994. Unlike its predecessors, it let users search for any word in any webpage, which has become the standard for all major search engines since. It was also the first one to be widely known by the public. Also in 1994, Lycos (which started at Carnegie Mellon University) was launched and became a major commercial endeavor.

Soon after, many search engines appeared and vied for popularity. These included Magellan, Excite, Infoseek, Inktomi, Northern Light, and AltaVista. Yahoo! was among the most popular ways for people to find web pages of interest, but its search function operated on its web directory, rather than full-text copies of web pages. Information seekers could also browse the directory instead of doing a keyword-based search.

In 1996, Netscape was looking to give a single search engine an exclusive deal to be the featured search engine on Netscape's web browser. There was so much interest that Netscape instead struck a deal with five of the major search engines: for $5 million per year, each search engine would be in rotation on the Netscape search engine page. The five engines were Yahoo!, Magellan, Lycos, Infoseek, and Excite.

Search engines were also known as some of the brightest stars in the Internet investing frenzy that occurred in the late 1990s. Several companies entered the market spectacularly, receiving record gains during their initial public offerings. Some have taken down their public search engine and are marketing enterprise-only editions, such as Northern Light. Many search engine companies were caught up in the dot-com bubble, a speculation-driven market boom that peaked in 1999 and ended in 2001.

Around 2000, Google's search engine rose to prominence. The company achieved better results for many searches with an innovation called PageRank. This iterative algorithm ranks web pages based on the number and PageRank of other web sites and pages that link there, on the premise that good or desirable pages are linked to more than others. Google also maintained a minimalist interface to its search engine. In contrast, many of its competitors embedded a search engine in a web portal.
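The iterative ranking idea behind PageRank can be illustrated with a small sketch. This is not Google's implementation: the function name `pagerank`, the damping value of 0.85, the fixed iteration count, and the four-page `toy_web` graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by repeatedly passing rank along links.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with rank spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page divides its rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by every other page.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page with the most incoming links ends up with the highest rank, and the page nobody links to ends up with the lowest, which matches the premise described above.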

By 2000, Yahoo! was providing search services based on Inktomi's search engine. Yahoo! acquired Inktomi in 2002, and Overture (which owned AlltheWeb and AltaVista) in 2003. Yahoo! switched to Google's search engine until 2004, when it launched its own search engine based on the combined technologies of its acquisitions.

Microsoft first launched MSN Search in the fall of 1998 using search results from Inktomi. In early 1999 the site began to display listings from LookSmart blended with results from Inktomi, except for a short time in 1999 when results from AltaVista were used instead. In 2004, Microsoft began a transition to its own search technology, powered by its own web crawler (called msnbot).

Microsoft's rebranded search engine, Bing, was launched on June 1, 2009. On July 29, 2009, Yahoo! and Microsoft finalized a deal in which Yahoo! Search would be powered by Microsoft Bing technology.

How web search engines work

A search engine operates in the following order:

Web crawling
Indexing
Searching

Web search engines work by storing information about many web pages, which they retrieve from the HTML itself. These pages are retrieved by a Web crawler (sometimes also known as a spider), an automated Web browser which follows every link on the site. Exclusions can be made by the use of robots.txt. The contents of each page are then analyzed to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields called meta tags). Data about web pages are stored in an index database for use in later queries. A query can be a single word. The purpose of an index is to allow information to be found as quickly as possible.

Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as information about the web pages, whereas others, such as AltaVista, store every word of every page they find. This cached page always holds the actual search text since it is the one that was actually indexed, so it can be very useful when the content of the current page has been updated and the search terms are no longer in it. This problem might be considered to be a mild form of linkrot, and Google's handling of it increases usability by satisfying user expectations that the search terms will be on the returned webpage, in keeping with the principle of least astonishment. Increased search relevance makes these cached pages very useful, even beyond the fact that they may contain data that may no longer be available elsewhere.
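Two of the crawling and indexing steps described above, honoring robots.txt exclusions and extracting words from titles, headings, and meta tags, can be sketched with Python's standard library. The robots.txt rules, the `ExampleBot` user agent, the example.com URLs, the sample page, and the `FieldExtractor` class are all hypothetical; a real crawler would fetch both the rules and the pages over the network.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an imaginary site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(url, user_agent="ExampleBot"):
    """Return True if the sample robots.txt rules permit fetching `url`."""
    rules = RobotFileParser()
    rules.parse(ROBOTS_TXT.splitlines())
    return rules.can_fetch(user_agent, url)


class FieldExtractor(HTMLParser):
    """Collect words from the title, headings, and selected meta tags."""

    TEXT_TAGS = ("title", "h1", "h2", "h3")

    def __init__(self):
        super().__init__()
        self.words = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        if tag in self.TEXT_TAGS:
            self._capturing = True
        elif tag == "meta":
            attributes = dict(attrs)
            if attributes.get("name") in ("keywords", "description"):
                content = attributes.get("content") or ""
                self.words.extend(content.lower().replace(",", " ").split())

    def handle_endtag(self, tag):
        if tag in self.TEXT_TAGS:
            self._capturing = False

    def handle_data(self, data):
        # Only text inside a title or heading is kept; body text is skipped.
        if self._capturing:
            self.words.extend(data.lower().split())


# Hypothetical page used in place of a real download.
PAGE = """<html><head><title>Web Search Engines</title>
<meta name="keywords" content="crawler, index">
</head><body><h1>How Crawling Works</h1>
<p>Body text is ignored in this sketch.</p></body></html>"""

extractor = FieldExtractor()
extractor.feed(PAGE)
extractor.close()

print(allowed("http://example.com/index.html"))
print(allowed("http://example.com/private/notes.html"))
print(extractor.words)
```

A full-text engine would index the body text as well; restricting the sketch to titles, headings, and meta tags simply mirrors the example fields named in the text.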

When a user enters a query into a search engine (typically by using keywords), the engine examines its index and provides a listing of best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes parts of the text. The index is built from the information stored with the data and the method by which the information is indexed. Unfortunately, there are currently no known public search engines that allow documents to be searched by date. Most search engines support the use of the Boolean operators AND, OR and NOT to further specify the search query. Boolean operators are for literal searches that allow the user to refine and extend the terms of the search. The engine looks for the words or phrases exactly as entered. Some search engines provide an advanced feature called proximity search, which allows users to define the distance between keywords. There is also concept-based searching, where the search involves statistical analysis of pages containing the words or phrases being searched for. In addition, natural language queries allow the user to type a question in the same form one would ask it of a human. One such site is Ask.com.
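The Boolean operators and proximity search described above map naturally onto set operations over an inverted index that records word positions. The following is a minimal sketch under simplifying assumptions: the three sample documents, the whitespace tokenizer, and the helper names (`build_index`, `docs_with`, `search_and`, `search_or`, `search_not`, `near`) are all made up for illustration and do not reflect any particular engine.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to {doc_id: [positions]} (a positional inverted index)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, word in enumerate(text.lower().split()):
            index[word][doc_id].append(position)
    return index


def docs_with(index, word):
    """Set of document ids containing `word` exactly as entered (case-folded)."""
    return set(index.get(word.lower(), {}))


def search_and(index, a, b):
    return docs_with(index, a) & docs_with(index, b)


def search_or(index, a, b):
    return docs_with(index, a) | docs_with(index, b)


def search_not(index, a, b):
    """Documents containing `a` but NOT `b`."""
    return docs_with(index, a) - docs_with(index, b)


def near(index, a, b, max_distance):
    """Documents where `a` and `b` occur within `max_distance` words."""
    hits = set()
    for doc_id in search_and(index, a, b):
        positions_a = index[a.lower()][doc_id]
        positions_b = index[b.lower()][doc_id]
        if any(abs(i - j) <= max_distance
               for i in positions_a for j in positions_b):
            hits.add(doc_id)
    return hits


# Hypothetical mini-collection standing in for crawled pages.
DOCS = {
    1: "web crawler follows every link",
    2: "the index stores every word",
    3: "a crawler feeds the web index",
}
index = build_index(DOCS)

print(search_and(index, "crawler", "web"))
print(search_or(index, "crawler", "index"))
print(search_not(index, "web", "index"))
print(near(index, "web", "index", 1))
```

Storing positions rather than just document ids is what makes the proximity check possible; an index used only for AND, OR and NOT could keep plain sets of document ids instead.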

The usefulness of a search engine depends on the relevance of the result set it gives back. While there may be millions of web pages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines employ methods to rank the results to provide the "best" results first. How a search engine decides which pages are the best matches, and what order the results should be shown in, varies widely from one engine to another. The methods also change over time as Internet usage changes and new techniques evolve. There are two main types of search engine that have evolved: one is a system of predefined and hierarchically ordered keywords that humans have programmed extensively. The other is a system that generates an "inverted index" by analyzing texts it locates. This second form relies much more heavily on the computer itself to do the bulk of the work.

Most Web search engines are commercial ventures supported by advertising revenue and, as a result, some employ the practice of allowing advertisers to pay money to have their listings ranked higher in search results. Those search engines which do not accept money for their search engine results make money by running search-related ads alongside the regular search engine results. The search engines make money every time someone clicks on one of these ads.

Market share and wars

Search engine | Market share in December 2010
Google
Yahoo!
Baidu
Bing
Ask
AOL

Google's worldwide market share peaked at 86.3% in April 2010. Yahoo!, Bing and other search engines are more popular in the US than in Europe.

According to Hitwise, market share in the U.S. for October 2011 was Google 65.38%, Bing-powered (Bing and Yahoo!) 28.62%, and the remaining 66 search engines 6%.

An Experian Hitwise report released in August 2011 gave the "success rate" of searches sampled in July. Over 80 percent of Yahoo! and Bing searches resulted in the users visiting a web site, while Google's rate was just under 68 percent.

In the People's Republic of China, Baidu held a 61.6% market share for web search in July 2009.

Search engine bias

Although search engines are programmed to rank websites based on their popularity and relevancy, empirical studies indicate various political, economic, and social biases in the information they provide. These biases could be a direct result of economic and commercial processes (e.g., companies that advertise with a search engine can also become more popular in its organic search results), and political processes (e.g., the removal of search results in order to comply with local laws). Google Bombing is one example of an attempt to manipulate search results for political, social or commercial reasons.

External links

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
