Desktop search
Desktop search refers to tools that search the contents of a user's own computer files rather than the Internet. These tools are designed to find information on the user's PC, including web browser histories, e-mail archives, text documents, sound files, images and video.
One of the main advantages of desktop search programs is that search results arrive in a few seconds. By comparison, the Windows Search Companion "can be some help, but it searches through Windows files and folders only, not e-mail or contact databases, and unless you enable the Indexing Service (in Windows 2000 or XP), the Windows search tool is extremely slow." Windows Vista enables the indexing service by default.
A variety of desktop search programs are now available.
Desktop search is emerging as a concern for large firms for two main reasons: untapped productivity and security. A commonly cited statistic states that 80% of a company's data is unstructured: the information stored on end users' PCs, the files and directories they have created on a network, documents stored in repositories such as corporate intranets, and a multitude of other locations. Moreover, many companies have structured or unstructured information stored in older file formats to which they don't have ready access.
Companies doing business in the United States are frequently required under regulatory mandates such as Sarbanes-Oxley, HIPAA and FERPA to make sure that access to sensitive information is fully controlled. This creates a challenge for IT organizations, which may not have a desktop search standard or may lack strict central control over end users downloading tools from the Internet. Some consumer-oriented desktop search tools make it possible to generate indexes outside the corporate firewall and share those indexes with unauthorized users. In some cases, end users are able to index (but not preview) items they should not even know exist.
Historically, full desktop search came from the work of Apple Computer's Advanced Technology Group, which produced the underlying AppleSearch technology in the early 1990s. It was used to build the Sherlock search engine and was then developed into Spotlight, which brought automated, non-timer-based full indexing into the operating system.
Technologies
Desktop search engines build and maintain an index database to achieve reasonable performance when searching several gigabytes of data. Indexing usually takes place when the computer is idle, and most search applications can be set to suspend it when a portable computer is running on batteries, in order to save power. When indexing files, desktop search tools collect three types of information:
- file and directory names
- metadata, such as titles, authors and comments in file types such as MP3, PDF and JPEG
- content of supported documents
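The indexing approach described above can be illustrated with a minimal sketch of an inverted index, which maps each term to the files that contain it. This is not how any particular product works: the function names (tokenize, build_index, search) are hypothetical, only file and directory names and the content of plain-text files are indexed, and embedded metadata such as MP3 or PDF titles is left out because it needs format-specific parsers.

```python
import os
import re
from collections import defaultdict

# Assumption: a term is any run of lowercase letters or digits.
TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return TOKEN.findall(text.lower())


def build_index(root):
    """Walk `root` and map each term to the set of file paths that contain it."""
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # File and directory names, relative to the indexed root.
            terms = set(tokenize(os.path.relpath(path, root)))
            # Content of supported documents (here: plain text only).
            if name.lower().endswith(".txt"):
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    terms.update(tokenize(handle.read()))
            for term in terms:
                index[term].add(path)
    return index


def search(index, query):
    """Return the paths that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results
```

Once the index is built, a query only intersects a few sets instead of reopening every file, which is why indexed search stays fast as the amount of data grows. The cost is that the index must be rebuilt or updated when files change.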
Besides programs that use indexing, there are many programs that open and search files on demand. Their disadvantage is that they can search only a given directory, not the entire computer. Their great advantage is that they do not burden the computer's resources with indexing. Furthermore, they always reflect the current state of the documents.
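A non-indexing search of this kind can be sketched as a direct scan of one directory tree. The function name scan_directory is hypothetical, and the sketch assumes the files can be read as text; it opens every file at query time, so it needs no index and always sees the current contents.

```python
import os


def scan_directory(root, needle):
    """Yield (path, line_number, line) for each line that contains `needle`.

    The match is a case-insensitive substring test. Every file under
    `root` is opened on each call, so no index is built or stored.
    """
    needle = needle.lower()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line.lower():
                            yield path, number, line.rstrip("\n")
            except OSError:
                # Unreadable file (permissions, broken link): skip it.
                continue
```

The running time grows with the total size of the files scanned, which is why this approach suits a single directory better than an entire disk.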
To search within documents, the tools need to be able to parse many different types of documents. This is achieved by using filters that interpret selected file formats. For example, a Microsoft Office Filter might be used to search inside Microsoft Office documents.
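The filter mechanism can be sketched as a registry that maps file extensions to text-extraction functions. Everything here is an illustrative assumption rather than the interface of any real product: the names FILTERS, plain_text_filter, html_filter and extract_text are hypothetical, and only plain text and HTML are handled, using the Python standard library. A filter for Microsoft Office documents would be registered the same way but needs a dedicated parser.

```python
import html.parser
import os


class _TextCollector(html.parser.HTMLParser):
    """Collect the text nodes of an HTML document and ignore the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def plain_text_filter(path):
    """Return the file's content unchanged."""
    with open(path, encoding="utf-8", errors="ignore") as handle:
        return handle.read()


def html_filter(path):
    """Return the text of an HTML file with the markup stripped."""
    collector = _TextCollector()
    with open(path, encoding="utf-8", errors="ignore") as handle:
        collector.feed(handle.read())
    collector.close()
    return " ".join(part.strip() for part in collector.parts if part.strip())


# Hypothetical registry: file extension -> filter function.
FILTERS = {
    ".txt": plain_text_filter,
    ".html": html_filter,
    ".htm": html_filter,
}


def extract_text(path):
    """Return the searchable text of `path`, or None if no filter is registered."""
    extension = os.path.splitext(path)[1].lower()
    file_filter = FILTERS.get(extension)
    return file_filter(path) if file_filter else None
```

Keeping the filters in a table means support for a new format is added by registering one more function, without changing the indexing code that calls extract_text.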
Long-term goals for desktop search include the ability to search the contents of image files, sound files and video by context.
The sector attracted considerable attention because of the struggle between Microsoft and Google. According to market analysts, both companies are attempting to leverage their monopolies (of web browsers and search engines, respectively) to strengthen their dominance. Following Google's complaint that users of Windows Vista cannot choose a competitor's desktop search program over the built-in one, an agreement was reached between the US Justice Department and Microsoft that Windows Vista Service Pack 1 would enable users to choose between the built-in and other desktop search programs, and to select which one is the default.
External links
- Keeper Finders, by Paul Boutin, Slate, December 31, 2004: a comparison of Google, Ask Jeeves, HotBot, MSN and Copernic desktop search tools.
- Evaluation of desktop search applications: a detailed evaluation of Google Desktop, Copernic Desktop Search, Yahoo! Desktop Search, Windows Desktop Search and ISYS:Desktop; dated 2006.
- GoebelGroup.com's desktop search tools comparison chart - Date of last update: 15 January 2007.
- A detailed comparison of desktop search tools - dated 2004.
- Comparison of desktop search software - Date of last update: March 2008