Astrophysics Data System
The Astrophysics Data System (usually referred to as ADS), developed by the National Aeronautics and Space Administration (NASA), is an online database of over eight million astronomy and physics papers from both peer-reviewed and non-peer-reviewed sources. Abstracts are available free online for almost all articles, and full scanned articles are available in Graphics Interchange Format (GIF) and Portable Document Format (PDF) for older articles. New articles have links to electronic versions hosted at the journal's webpage, but these are typically available only by subscription (which most astronomy research facilities have). ADS is managed by the Harvard–Smithsonian Center for Astrophysics.
ADS is a powerful research tool and has had a significant impact on the efficiency of astronomical research since it was launched in 1992. Literature searches that previously would have taken days or weeks can now be carried out in seconds via the ADS search engine, custom-built for astronomical needs. Studies have found that the benefit to astronomy of the ADS is equivalent to several hundred million US dollars annually, and the system is estimated to have tripled the readership of astronomical journals.
Use of ADS is almost universal among astronomers worldwide, and therefore ADS usage statistics can be used to analyze global trends in astronomical research. These studies have revealed that the amount of research an astronomer carries out is related to the per capita gross domestic product (GDP) of the country in which the astronomer is based, and that the number of astronomers in a country is proportional to the GDP of that country, so the total amount of research done in a country is proportional to the square of its GDP divided by its population.
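History
For many years, a growing problem in astronomical research (as in other academic disciplines) was that the number of papers published in the major astronomical journals was increasing steadily, meaning astronomers were able to read less and less of the latest research findings. During the 1980s, astronomers saw that the nascent technologies which formed the basis of the Internet could eventually be used to build an electronic indexing system of astronomical research papers which would allow astronomers to keep abreast of a much greater range of research.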
The first suggestion of a database of journal paper abstracts was made at a conference on Astronomy from Large Data-bases held in Garching bei München in 1987. Initial development of an electronic system for accessing astrophysical abstracts took place during the following two years; in 1991 discussions took place on how to integrate ADS with the SIMBAD database, containing all available catalog designations for objects outside the solar system, to create a system where astronomers could search for all the papers written about a given object.
An initial version of ADS, with a database consisting of 40 papers, was created as a proof of concept in 1988, and the ADS database was successfully connected with the SIMBAD database in the summer of 1993. The creators believed this was the first use of the Internet to allow simultaneous querying of transatlantic scientific databases. Until 1994, the service was available via proprietary network software, but it was transferred to the nascent World Wide Web early that year. The number of users of the service quadrupled in the five weeks following the introduction of the ADS web-based service.
At first, the journal articles available via ADS were scanned bitmaps created from the paper journals, but from 1995 onwards, the Astrophysical Journal began to publish an on-line edition, soon followed by the other main journals such as Astronomy and Astrophysics and the Monthly Notices of the Royal Astronomical Society. ADS provided links to these electronic editions from their first appearance. Since about 1995, the number of ADS users has doubled roughly every two years. ADS now has agreements with almost all astronomical journals, which supply abstracts. Scanned articles from as far back as the early 19th century are available via the service, which now contains over eight million documents. The service is distributed worldwide, with twelve mirror sites in twelve countries on five continents; the database is synchronized by weekly updates using rsync, a mirroring utility which allows updates to only the portions of the database which have changed. All updates are triggered centrally, but they initiate scripts at the mirror sites which "pull" updated data from the main ADS servers.
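Data in the system
Papers are indexed within the database by their bibliographic record, containing the details of the journal they were published in and various associated metadata, such as author lists, references and citations. Originally this data was stored in ASCII format, but eventually the limitations of this encouraged the database maintainers to migrate all records to an XML (Extensible Markup Language) format in 2000. Bibliographic records are now stored as an XML element, with sub-elements for the various metadata.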
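To make the idea of a record stored as one XML element with metadata sub-elements concrete, here is a minimal Python sketch. The element names, the `id` attribute and the sample values are illustrative assumptions only; the actual ADS record layout is not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: every element and attribute name below is an
# illustrative assumption, not the actual ADS schema.
RECORD_XML = """
<record id="EXAMPLE-0001">
  <title>An example paper title</title>
  <journal>Example Journal of Astronomy</journal>
  <authors>
    <author>Doe, J.</author>
    <author>Roe, R.</author>
  </authors>
  <references>
    <reference>EXAMPLE-0000</reference>
  </references>
</record>
"""

def parse_record(xml_text):
    """Turn one record element into a plain dictionary of its metadata."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.get("id"),
        "title": root.findtext("title"),
        "journal": root.findtext("journal"),
        "authors": [a.text for a in root.findall("authors/author")],
        "references": [r.text for r in root.findall("references/reference")],
    }

record = parse_record(RECORD_XML)
```

Keeping each kind of metadata in its own sub-element means new fields can be added without breaking code that reads only the existing ones, which is one plausible advantage over a fixed-layout ASCII format.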
Since the advent of online editions of journals, abstracts are loaded into the ADS on or before the publication date of articles, with the full journal text available to subscribers. Older articles have been scanned, and an abstract is created using optical character recognition software. Scanned articles from before about 1995 are usually available free, by agreement with the journal publishers.
Scanned articles are stored in TIFF format, at both medium and high resolution. The TIFF files are converted on demand into GIF files for on-screen viewing, and PDF or PostScript files for printing. The generated files are then cached to eliminate needlessly frequent regenerations for popular articles. As of 2000, ADS contained 250 GB of scans, which consisted of 1,128,955 article pages comprising 138,789 articles. By 2005 this had grown to 650 GB, and was expected to grow further, to about 900 GB by 2007. No further information has been published.
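The convert-on-demand-then-cache pattern described above can be sketched in a few lines of Python. This is only an illustration: `convert_tiff` is a hypothetical stand-in for a real image converter, and the in-memory `lru_cache` stands in for whatever cache ADS actually uses.

```python
from functools import lru_cache

conversions_run = []  # records each real conversion so cache hits are visible

def convert_tiff(article_id, fmt):
    """Hypothetical stand-in for a real TIFF-to-GIF/PDF/PostScript converter."""
    conversions_run.append((article_id, fmt))
    return f"{article_id}.{fmt}".encode()

@lru_cache(maxsize=1024)
def get_rendition(article_id, fmt):
    """Return the converted file, running the converter only on a cache miss."""
    if fmt not in {"gif", "pdf", "ps"}:
        raise ValueError(f"unsupported format: {fmt}")
    return convert_tiff(article_id, fmt)

get_rendition("article-1", "gif")
get_rendition("article-1", "gif")  # served from the cache, no new conversion
get_rendition("article-1", "pdf")
```

After these three requests `conversions_run` holds only two entries, because the repeated GIF request never reaches the converter.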
The database initially contained only astronomical references, but has now grown to incorporate three databases, covering astronomy (including planetary sciences and solar physics) references, physics (including instrumentation and geosciences) references, as well as preprints of scientific papers from arXiv. The astronomy database is by far the most advanced and its use accounts for about 85% of the total ADS usage. Articles are assigned to the different databases according to the subject rather than the journal they are published in, so that articles from any one journal might appear in all three subject databases. The separation of the databases allows searching in each discipline to be tailored, so that words can automatically be given different weight functions in different database searches, depending on how common they are in the relevant field.
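One way to picture per-database word weighting is an inverse-document-frequency style weight computed separately for each database. The sketch below assumes that approach purely for illustration; the weighting ADS actually uses is not specified here, and the sample documents are invented.

```python
import math
from collections import Counter

def word_weights(documents):
    """Weight each word by log(N / document frequency) within one database."""
    n_docs = len(documents)
    doc_freq = Counter()
    for text in documents:
        doc_freq.update(set(text.lower().split()))
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}

# Invented sample abstracts, one list per subject database.
astronomy_docs = [
    "star formation in galaxy clusters",
    "star spectra survey",
    "binary star orbits",
]
physics_docs = [
    "laser cooling of atoms",
    "star polymer dynamics",
    "quantum laser optics",
]

astro_w = word_weights(astronomy_docs)
phys_w = word_weights(physics_docs)
```

Here "star" occurs in every astronomy document and so carries no weight there, while in the physics collection it is rare and therefore weighted more heavily.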
Data in the preprint archive is updated daily from the arXiv, the main repository of physics and astronomy preprints. The advent of preprint servers has, like ADS, had a significant impact on the rate of astronomical research, as papers are often made available from preprint servers weeks or months before they are published in the journals. The incorporation of preprints from the arXiv into ADS means that the search engine can return the most current research available, with the caveat that preprints may not have been peer reviewed or proofread to the required standard for publication in the main journals. ADS's database links preprints with subsequently published articles wherever possible, so that citation and reference searches will return links to the journal article where the preprint was cited.
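Software and hardware
The software runs on a system that was written specifically for it, allowing for extensive customization for astronomical needs that would not have been possible with general purpose database software. The scripts are designed to be as platform independent as possible, given the need to facilitate mirroring on different systems around the world, although the growing use of Linux as the operating system of choice within astronomy has led to increasing optimization of the scripts for installation on that platform.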
The main ADS server is located at the Harvard–Smithsonian Center for Astrophysics in Cambridge, Massachusetts, and is a 64-bit x86 Intel server with two quad-core 3.0 GHz CPUs and 32 GB of RAM, running the CentOS 5.4 Linux distribution. Mirrors are located in Brazil, China, Chile, France, Germany, India, Indonesia, Japan, Russia, South Korea, the United Kingdom, and Ukraine.
[...] and LaTeX by almost all scientific journals greatly facilitates the incorporation of bibliographic data into the system in a standardized format, and importing HTML-coded web-based articles is also simple. ADS utilizes Perl scripts for importing, processing and standardizing bibliographic data.
The apparently mundane task of converting author names into a standard Surname, Initial format is actually one of the more difficult to automate, due to the wide variety of naming conventions around the world and the possibility that a given name such as Davis could be a first name, middle name or surname. The accurate conversion of names requires a detailed knowledge of the names of authors active in astronomy, and ADS maintains an extensive database of author names, which is also used in searching the database (see below).
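The difficulty is easy to see with a naive converter. The Python sketch below is not the ADS method; it assumes that, when no comma is present, the last word is the surname, which is exactly the kind of rule that breaks for a name like Davis.

```python
def to_surname_initials(name):
    """Convert 'Given Middle Surname' or 'Surname, Given' to 'Surname, G. M.'.

    Simplifying assumption: with no comma, the last token is the surname.
    """
    name = " ".join(name.split())  # collapse repeated whitespace
    if "," in name:
        surname, given = [part.strip() for part in name.split(",", 1)]
    else:
        tokens = name.split(" ")
        surname, given = tokens[-1], " ".join(tokens[:-1])
    initials = " ".join(f"{g[0].upper()}." for g in given.replace(".", " ").split())
    return f"{surname}, {initials}" if initials else surname

examples = {
    "Marc Davis": to_surname_initials("Marc Davis"),
    "Davis, Marc": to_surname_initials("Davis, Marc"),
    "Davis Philip": to_surname_initials("Davis Philip"),
}
```

The first two inputs both yield "Davis, M.", but the third yields "Philip, D." whether or not Davis was meant as a surname. A positional rule cannot tell the cases apart, which is why a curated list of known author names is needed.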
For electronic articles, a list of the references given at the end of the article is easily extracted. For scanned articles, reference extraction relies on OCR. The reference database can then be "inverted" to list the citations for each paper in the database. Citation lists have been used in the past to identify popular articles missing from the database; mostly these were from before 1975 and have now been added to the system.
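Inverting the reference database amounts to flipping a mapping from "paper to the papers it references" into "paper to the papers that cite it". The sketch below uses made-up identifiers and a hypothetical `min_citations` threshold to show both the inversion and how frequently cited but missing papers could be spotted.

```python
from collections import defaultdict

def invert_references(references):
    """Map each cited paper to the list of papers that cite it."""
    citations = defaultdict(list)
    for citing, cited_list in references.items():
        for cited in cited_list:
            citations[cited].append(citing)
    return dict(citations)

def missing_but_cited(references, citations, min_citations=2):
    """Papers cited at least min_citations times but absent from the database."""
    return sorted(
        paper for paper, citing in citations.items()
        if paper not in references and len(citing) >= min_citations
    )

# Made-up identifiers: keys are papers in the database, values their references.
references = {"A": ["B", "X"], "B": ["X"], "C": ["A", "B"]}
citations = invert_references(references)
missing = missing_but_cited(references, citations)
```

In this toy data, paper "X" is cited twice but has no record of its own, so it is the one reported as missing.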
[...], Astronomical Journal, Astronomy and Astrophysics, Publications of the Astronomical Society of the Pacific and the Monthly Notices of the Royal Astronomical Society), coverage is complete, with all issues indexed from number 1 to the present. These journals account for about two-thirds of the papers in the database, with the rest consisting of papers published in over 100 other journals from around the world.
While the database contains the complete contents of all the major journals and many minor ones as well, its coverage of references and citations is much less complete. References in and citations of articles in the major journals are fairly complete, but references such as "private communication", "in press" or "in preparation" cannot be matched, and author errors in reference listings also introduce potential errors. Astronomical papers may cite and be cited by articles in journals which fall outside the scope of ADS, such as chemistry, mathematics or biology journals.
[...] assume that the user is well-versed in astronomy and able to interpret search results which are designed to return more than just the most relevant papers. The database can be queried for author names, astronomical object names, title words, and words in the abstract text, and results can be filtered according to a number of criteria. It works by first gathering synonyms and simplifying search terms as described above, and then generating an "inverted file", which is a list of all the documents matching each search term. The user-selected logic and filters are then applied to this inverted list to generate the final search results.
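A small Python sketch of the inverted-file idea follows. It assumes the simplest possible form, a precomputed mapping from each word to the set of documents containing it, with AND or OR logic applied afterwards; the function names and sample documents are invented for illustration.

```python
def build_inverted_file(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = {}
    for doc_id, text in documents.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index

def search(index, terms, logic="OR"):
    """Look up each term's document list, then combine the lists."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    if logic == "AND":
        return set.intersection(*postings)
    return set.union(*postings)

# Invented sample documents keyed by id.
documents = {
    1: "Radius of a planetary nebula",
    2: "Velocity and radius measurements",
    3: "Stellar abundance survey",
}
index = build_inverted_file(documents)
both = search(index, ["radius", "velocity"], logic="AND")
either = search(index, ["radius", "velocity"], logic="OR")
```

Because the per-term document lists are looked up first, the logic step reduces to set intersection or union, which is cheap even when each list is long.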
[...]s and transliterations from Arabic or Cyrillic script. An example of an entry in the author synonym list is:
[...], the NASA/IPAC Extragalactic Database, the International Astronomical Union Circulars and the Lunar and Planetary Institute to identify papers referring to a given object, and can also search by object position, listing papers which concern objects within a 10 arcminute radius of a given Right Ascension and Declination. These databases combine the many catalogue designations an object might have, so that a search for the Pleiades will also find papers which list the famous open cluster in Taurus under any of its other catalog designations or popular names, such as M45, the Seven Sisters or Melotte 22.
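A position search of this kind comes down to computing the angular separation between two points on the sky and comparing it with the radius. The sketch below uses the standard haversine formula; the object names and coordinates are invented, and nothing here reflects how the object databases actually implement the search.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))

def within_radius(objects, ra, dec, radius_arcmin=10.0):
    """Names of objects no more than radius_arcmin from the position (ra, dec)."""
    return [
        name for name, (obj_ra, obj_dec) in objects.items()
        if angular_separation_deg(ra, dec, obj_ra, obj_dec) * 60 <= radius_arcmin
    ]

# Invented objects: name -> (Right Ascension, Declination) in degrees.
objects = {"object A": (56.75, 24.20), "object B": (58.0, 24.0)}
nearby = within_radius(objects, ra=56.75, dec=24.1167)
```

Only "object A", about 5 arcminutes from the search position, falls inside the 10 arcminute radius; "object B" is more than a degree away.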
[...] has the space or hyphen removed, so that searching for Messier catalogue objects is simplified and user inputs of M45, M 45 or M-45 all result in the same query being executed; similarly, NGC designations and common search terms such as Shoemaker Levy and T Tauri are stripped of spaces. Unimportant words such as AT, OR and TO are stripped out, although in some cases case sensitivity is maintained, so that while "and" is ignored, "And" is converted to "Andromedae", and "Her" is converted to "Herculis", but "her" is ignored.
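The simplification rules described here can be sketched as a short Python function. The stop-word set, the case-sensitive table and the regular expression are small illustrative assumptions covering only the examples in the text, not the actual ADS rules.

```python
import re

# Illustrative subsets only; the real lists are not given in the text.
STOP_WORDS = {"at", "or", "to", "and", "her"}
CASE_SENSITIVE = {"And": "Andromedae", "Her": "Herculis"}

def normalize_query(query):
    """Join catalogue prefixes to their numbers, then drop unimportant words."""
    # "M 45" and "M-45" both become "M45"; likewise for NGC designations.
    query = re.sub(r"\b(M|NGC)[\s-]+(\d+)\b", r"\1\2", query)
    terms = []
    for word in query.split():
        if word in CASE_SENSITIVE:        # exact-case match: constellation name
            terms.append(CASE_SENSITIVE[word])
        elif word.lower() in STOP_WORDS:  # unimportant word: ignore
            continue
        else:
            terms.append(word)
    return terms
```

The case-sensitive lookup runs before the stop-word check, so "And" survives as "Andromedae" while "and" is dropped.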
[...] replacement such as searching for both plural and singular forms, ADS also searches for a large number of specifically astronomical synonyms. For example, spectrograph and spectroscope have basically the same meaning, and in an astronomical context metallicity and abundance are also synonymous. ADS's synonym list was created manually, by grouping the list of words in the database according to similar meanings.
As well as English language synonyms, ADS also searches for English translations of foreign search terms and vice versa, so that a search for the French word soleil retrieves references to Sun, and papers in languages other than English can be returned by English search terms.
Synonym replacement can be disabled if required, so that a rare term which is a synonym of a much more common term (such as 'dateline' rather than 'date') can be searched for specifically.
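Synonym replacement with an off switch can be sketched as follows. The groups contain only the examples mentioned in the text, and the function name and `use_synonyms` flag are hypothetical.

```python
# Manually curated groups of interchangeable terms (examples from the text only).
SYNONYM_GROUPS = [
    {"spectrograph", "spectroscope"},
    {"metallicity", "abundance"},
    {"sun", "soleil"},
]

def expand_term(term, use_synonyms=True):
    """Return every term to search for, optionally including synonyms."""
    term = term.lower()
    if use_synonyms:
        for group in SYNONYM_GROUPS:
            if term in group:
                return set(group)
    return {term}

expanded = expand_term("Soleil")
exact_only = expand_term("metallicity", use_synonyms=False)
```

With expansion on, a search for "soleil" also looks for "sun"; with it off, only the literal term is used, which is what a search for a rare synonym of a common word needs.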
[...] both within fields and between fields. Search terms in each field can be combined with OR, AND, simple logic or Boolean logic, and the user can specify which fields must be matched in the search results. This allows complex searches to be built; for example, the user could search for papers concerning NGC 6543 OR NGC 7009, with the paper titles containing (radius OR velocity) AND NOT (abundance OR temperature).
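The example query above can be written out as a filter over paper records. The sketch below hard-codes that one query; the record layout and the three sample papers are invented for illustration.

```python
OBJECTS_WANTED = {"NGC 6543", "NGC 7009"}  # object field: either object
TITLE_ANY = {"radius", "velocity"}         # title must contain one of these
TITLE_NONE = {"abundance", "temperature"}  # and none of these

def matches(paper):
    """Apply the object condition AND the title condition to one paper."""
    title_words = set(paper["title"].lower().split())
    object_ok = bool(OBJECTS_WANTED & set(paper["objects"]))
    title_ok = bool(TITLE_ANY & title_words) and not (TITLE_NONE & title_words)
    return object_ok and title_ok

# Invented sample records.
papers = [
    {"title": "Expansion velocity of a planetary nebula", "objects": ["NGC 6543"]},
    {"title": "Radius and temperature of the central star", "objects": ["NGC 7009"]},
    {"title": "Velocity field of a spiral galaxy", "objects": ["NGC 1300"]},
]
results = [p["title"] for p in papers if matches(p)]
```

Only the first paper passes: the second is excluded by "temperature" in its title, and the third concerns a different object.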
[...] proceedings can be excluded or specifically searched for, or specific journals can be included in or excluded from the search.
Also returned are links to the SIMBAD and/or NASA Extragalactic Database object name databases, via which a user can quickly find out basic observational data about the objects analyzed in a paper, and find further papers on those objects.
In monetary terms, this increase in efficiency represents a considerable amount. There are about 12,000 active astronomical researchers worldwide, so ADS is the equivalent of about 5% of the working population of astronomers. The global astronomical research budget is estimated at between 4,000 and 5,000 million USD, so the value of ADS to astronomy would be about 200–250 million USD annually. Its operating budget is a small fraction of this amount.
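The monetary estimate follows from applying the 5% figure to the two ends of the budget range. As a quick check in Python, using only the numbers quoted above:

```python
ADS_SHARE_PERCENT = 5                   # ADS as a share of astronomers' work
BUDGET_RANGE_MILLION_USD = (4000, 5000)  # estimated global research budget

value_range_million_usd = tuple(
    budget * ADS_SHARE_PERCENT / 100 for budget in BUDGET_RANGE_MILLION_USD
)
```

This gives 200 and 250 million USD for the low and high ends of the budget range.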
The great importance of ADS to astronomers has been recognized by the United Nations, the General Assembly of which has commended ADS on its work and success, particularly noting its importance to astronomers in the developing world, in reports of the United Nations Committee on the Peaceful Uses of Outer Space. A 2002 report by a visiting committee to the Center for Astrophysics, meanwhile, said that the service had "revolutionized the use of the astronomical literature", and was "probably the most valuable single contribution to astronomy research that the CfA has made in its lifetime".
[...] can easily be used to determine the user's geographical location. Studies reveal that the highest per-capita users of ADS are astronomers based in France and the Netherlands, and while more developed countries (measured by GDP per capita) use the system more than less developed countries, the relationship between GDP per capita and ADS use is not linear. The range of ADS usage per capita far exceeds the range of GDPs per capita, and basic research carried out in a country, as measured by ADS usage, has been found to be proportional to the square of the country's GDP divided by its population.
ADS usage statistics also suggest that astronomers in more developed countries tend to be more productive than those in less developed countries. The amount of basic research carried out is proportional to the number of astronomers in a country multiplied by the GDP per capita. Statistics also imply that astronomers in European cultures carry out about three times as much research as those in Asian cultures, perhaps suggesting cultural differences in the importance attached to astronomical research.
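The two proportionalities quoted above combine into the GDP-squared relation: if research scales as the number of astronomers times GDP per capita, and the number of astronomers scales as GDP, then research scales as GDP times GDP divided by population. The Python sketch below checks this numerically; the proportionality constants `astronomers_per_gdp` and `k` and the sample figures are arbitrary placeholders, not measured values.

```python
import math

def research_from_astronomers(gdp, population, astronomers_per_gdp=1.0, k=1.0):
    """Research ~ (number of astronomers) x (GDP per capita)."""
    n_astronomers = astronomers_per_gdp * gdp  # astronomers proportional to GDP
    return k * n_astronomers * (gdp / population)

def research_from_gdp(gdp, population, astronomers_per_gdp=1.0, k=1.0):
    """The same quantity written directly as GDP squared over population."""
    return k * astronomers_per_gdp * gdp ** 2 / population

# Arbitrary sample figures in arbitrary units.
a = research_from_astronomers(gdp=3.0, population=2.0)
b = research_from_gdp(gdp=3.0, population=2.0)
same = math.isclose(a, b)
ratio = research_from_gdp(2.0, 1.0) / research_from_gdp(1.0, 1.0)
```

Both forms give the same value, and doubling GDP at fixed population quadruples the research output, which is why the relationship with GDP per capita is not linear.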
ADS has also been used to show that the fraction of single-author astronomy papers has decreased substantially since 1975 and that astronomical papers with more than 50 authors have become more common since 1990.
Software and hardware
The software runs on a system that was written specifically for it, allowing for extensive customization for astronomical needs that would not have been possible with general-purpose database software. The scripts are designed to be as platform independent as possible, given the need to facilitate mirroring on different systems around the world, although the growing use of Linux as the operating system of choice within astronomy has led to increasing optimization of the scripts for installation on that platform.
The main ADS server is located at the Harvard-Smithsonian Center for Astrophysics in Cambridge, Massachusetts, and is a dual 64-bit x86 Intel server with two quad-core 3.0 GHz CPUs and 32 GB of RAM, running the CentOS 5.4 Linux distribution. Mirrors are located in Brazil, China, Chile, France, Germany, India, Indonesia, Japan, Russia, South Korea, the United Kingdom, and Ukraine.
Indexing
ADS currently receives abstracts or tables of contents from almost two hundred journal sources. The service may receive data referring to the same article from multiple sources, and creates one bibliographic reference based on the most accurate data from each source. The common use of TeX and LaTeX by almost all scientific journals greatly facilitates the incorporation of bibliographic data into the system in a standardized format, and importing HTML-coded web-based articles is also simple. ADS uses Perl scripts for importing, processing and standardizing bibliographic data.
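Combining several records for the same article into one reference might look like the sketch below. It is written in Python rather than the Perl that ADS uses, and the rule "take each field from the most trusted source that supplies it" is an assumption made for illustration; the record fields, source names and priority list are all hypothetical.

```python
def merge_records(records, priority):
    """Merge records describing one article into a single reference.

    records:  list of dicts, each with a "source" key plus bibliographic fields.
    priority: list of source names, most trusted first (assumed ranking).
    Every source named in the records must appear in the priority list.
    """
    merged = {}
    ranked = sorted(records, key=lambda rec: priority.index(rec["source"]))
    for record in ranked:
        for field, value in record.items():
            # Keep the first non-empty value seen, i.e. the most trusted one.
            if field != "source" and value and field not in merged:
                merged[field] = value
    return merged


# Hypothetical input: the publisher feed lacks page numbers, the scan has a typo.
records = [
    {"source": "scan", "title": "A Stdy of Nebulae", "pages": "1-10"},
    {"source": "publisher", "title": "A Study of Nebulae", "pages": ""},
]
reference = merge_records(records, priority=["publisher", "scan"])
```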
The apparently mundane task of converting author names into a standard "Surname, Initial" format is actually one of the more difficult to automate, because naming conventions vary widely around the world and because a name such as Davis could be a first name, a middle name or a surname. Accurate conversion requires detailed knowledge of the names of authors active in astronomy, so ADS maintains an extensive database of author names, which is also used in searching the database (see below).
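A minimal sketch of the conversion, handling only simple cases, shows where the difficulty comes from. The function name, the `known_surnames` parameter and the sample names are hypothetical; the real system relies on a far larger curated author database.

```python
def to_surname_initials(name, known_surnames=()):
    """Convert a personal name to 'SURNAME, I' form (simple cases only).

    'Surname, Given' input is split at the comma. Otherwise the last word is
    assumed to be the surname unless a longer trailing phrase is listed in
    known_surnames (an assumed stand-in for an author-name database).
    """
    if "," in name:
        surname_part, given_part = name.split(",", 1)
        surname = " ".join(surname_part.replace(".", " ").split())
        given = given_part.replace(".", " ").split()
    else:
        tokens = name.replace(".", " ").split()
        known = {s.upper() for s in known_surnames}
        split_at = max(len(tokens) - 1, 0)
        # Try the longest trailing phrase first, e.g. "van Dijk".
        for i in range(1, len(tokens)):
            if " ".join(tokens[i:]).upper() in known:
                split_at = i
                break
        surname = " ".join(tokens[split_at:])
        given = tokens[:split_at]
    initials = " ".join(word[0].upper() for word in given)
    return f"{surname.upper()}, {initials}".rstrip(", ")
```

With these rules, `to_surname_initials("J. Davis Smith")` treats Davis as a middle name, while `to_surname_initials("Davis, J.")` treats it as a surname. Without the comma or a name list, the code has no way to tell the two readings apart, which is the ambiguity described above.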
For electronic articles, a list of the references given at the end of the article is easily extracted. For scanned articles, reference extraction relies on OCR. The reference database can then be "inverted" to list the citations for each paper in the database. Citation lists have been used in the past to identify popular articles missing from the database; mostly these were from before 1975 and have now been added to the system.
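Inverting the reference database is a simple transformation. The sketch below uses hypothetical paper identifiers; `cited_but_missing` illustrates, under the same assumptions, how frequently cited papers that are absent from the database could be spotted.

```python
from collections import defaultdict


def invert_references(references):
    """Turn {citing paper: [cited papers]} into {cited paper: [citing papers]}."""
    citations = defaultdict(list)
    for citing, cited_list in references.items():
        for cited in cited_list:
            citations[cited].append(citing)
    return {paper: sorted(citing) for paper, citing in citations.items()}


def cited_but_missing(references, min_citations=2):
    """List papers cited at least min_citations times but not in the database.

    Here a paper counts as "in the database" if it has its own reference list.
    """
    citations = invert_references(references)
    return sorted(
        paper
        for paper, citing in citations.items()
        if paper not in references and len(citing) >= min_citations
    )


# Hypothetical reference lists keyed by paper identifier.
references = {"A": ["B", "C"], "B": ["C"], "D": ["C", "X"]}
citations = invert_references(references)
```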
Coverage
The database now contains over eight million articles. In the cases of the major journals of astronomy (Astrophysical Journal, Astronomical Journal, Astronomy and Astrophysics, Publications of the Astronomical Society of the Pacific and the Monthly Notices of the Royal Astronomical Society), coverage is complete, with all issues indexed from number 1 to the present. These journals account for about two-thirds of the papers in the database, with the rest consisting of papers published in over 100 other journals from around the world.
While the database contains the complete contents of all the major journals and many minor ones as well, its coverage of references and citations is much less complete. References in and citations of articles in the major journals are fairly complete, but references such as "private communication", "in press" or "in preparation" cannot be matched, and author errors in reference listings also introduce potential errors. Astronomical papers may cite and be cited by articles in journals which fall outside the scope of ADS, such as chemistry, mathematics or biology journals.
Search engine
Since its inception, the ADS has developed a highly complex search engine to query the abstract and object databases. The search engine is tailor-made for searching astronomical abstracts, and the engine and its user interface assume that the user is well-versed in astronomy and able to interpret search results which are designed to return more than just the most relevant papers. The database can be queried for author names, astronomical object names, title words, and words in the abstract text, and results can be filtered according to a number of criteria. It works by first gathering synonyms and simplifying search terms, as described in the sections below, and then generating an "inverted file", which is a list of all the documents matching each search term. The user-selected logic and filters are then applied to this inverted list to generate the final search results.
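The inverted-file step can be sketched as below. This is a generic illustration of the technique, not ADS's implementation; the document identifiers and texts are hypothetical, and only whole lowercase words are indexed.

```python
def build_inverted_file(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


def search(index, terms, logic="OR"):
    """Combine the posting sets of the given terms with AND or OR logic."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    if logic == "AND":
        return set.intersection(*postings)
    return set.union(*postings)


# Hypothetical abstracts keyed by document id.
documents = {
    1: "planetary nebula expansion",
    2: "nebula abundance survey",
    3: "galaxy survey",
}
index = build_inverted_file(documents)
```

Because each term lookup returns a ready-made set of documents, applying the user's logic reduces to set intersections and unions.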
Author name queries
The system indexes author names by surname and initials, and accounts for possible variations in the spelling of names using a list of variations. Such variations are common for names that include diacritics such as umlauts, and for names transliterated from Arabic or Cyrillic script. An example of an entry in the author synonym list is:
- AFANASJEV, V
- AFANAS’EV, V
- AFANAS’IEV, V
- AFANASEV, V
- AFANASYEV, V
- AFANS’IEV, V
- AFANSEV, V
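A lookup built from such entries lets a query on any one spelling return all of them. The sketch below reuses the entry shown above; the function names and the dictionary-based data structure are assumptions for illustration.

```python
# One synonym group per author; this group is the example entry from the text.
SYNONYM_GROUPS = [
    [
        "AFANASJEV, V",
        "AFANAS’EV, V",
        "AFANAS’IEV, V",
        "AFANASEV, V",
        "AFANASYEV, V",
        "AFANS’IEV, V",
        "AFANSEV, V",
    ],
]


def build_variant_lookup(groups):
    """Map every spelling to the full set of spellings in its group."""
    lookup = {}
    for group in groups:
        variants = frozenset(group)
        for name in group:
            lookup[name] = variants
    return lookup


def expand_author(name, lookup):
    """Return all known spellings of a name, or just the name if unknown."""
    return lookup.get(name, frozenset([name]))


lookup = build_variant_lookup(SYNONYM_GROUPS)
spellings = expand_author("AFANASEV, V", lookup)
```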
Object name searches
The capability to search for papers on specific astronomical objects is one of ADS's most powerful tools. The system uses data from SIMBAD, the NASA/IPAC Extragalactic Database, the International Astronomical Union Circulars and the Lunar and Planetary Institute to identify papers referring to a given object, and can also search by object position, listing papers which concern objects within a 10 arcminute radius of a given right ascension and declination. These databases combine the many catalogue designations an object might have, so that a search for the Pleiades will also find papers which list the famous open cluster in Taurus under any of its other catalogue designations or popular names, such as M45, the Seven Sisters or Melotte 22.
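A position search of this kind comes down to computing angular separations on the sky. The sketch below uses the haversine formula with coordinates in decimal degrees; the catalogue entries are hypothetical, and real services such as SIMBAD use indexed catalogues rather than a linear scan.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Angular distance in degrees between two sky positions (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def objects_within(ra, dec, catalogue, radius_arcmin=10.0):
    """Names of catalogue objects within radius_arcmin of (ra, dec)."""
    limit_deg = radius_arcmin / 60.0
    return [
        name
        for name, (obj_ra, obj_dec) in catalogue.items()
        if angular_separation_deg(ra, dec, obj_ra, obj_dec) <= limit_deg
    ]


# Hypothetical objects: (right ascension, declination) in decimal degrees.
catalogue = {
    "obj_a": (56.75, 24.12),
    "obj_b": (56.85, 24.12),
    "obj_c": (58.00, 24.12),
}
nearby = objects_within(56.75, 24.12, catalogue)
```

Note that a difference of 0.1 degrees in right ascension at this declination corresponds to only about 5.5 arcminutes on the sky, because lines of right ascension converge towards the poles.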
Title and abstract searches
The search engine first filters search terms in several ways. An M followed by a space or hyphen has the space or hyphen removed, so that searching for Messier catalogue objects is simplified and a user input of M45, M 45 or M-45 all result in the same query being executed; similarly, NGC designations and common search terms such as Shoemaker Levy and T Tauri are stripped of spaces. Unimportant words such as AT, OR and TO are stripped out, although in some cases case sensitivity is maintained, so that while "and" is ignored, "And" is converted to "Andromedae", and "Her" is converted to "Herculis", but "her" is ignored.
Synonym replacement
Once search terms have been pre-processed, the database is queried with the revised search term, as well as synonyms for it. As well as simple synonym replacement such as searching for both plural and singular forms, ADS also searches for a large number of specifically astronomical synonyms. For example, spectrograph and spectroscope have basically the same meaning, and in an astronomical context metallicity and abundance are also synonymous. ADS's synonym list was created manually, by grouping the list of words in the database according to similar meanings.
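The pre-processing and synonym steps can be sketched together as follows. The word lists contain only the examples mentioned in this article, the regular expression covers only the M and NGC cases, and plural handling is omitted; the function names are hypothetical and none of this is ADS's actual code.

```python
import re

# Case-sensitive abbreviations are checked before stop words are dropped.
CONSTELLATION_ABBREVIATIONS = {"And": "Andromedae", "Her": "Herculis"}
STOP_WORDS = {"at", "or", "to", "and", "her"}
SYNONYM_GROUPS = [
    {"spectrograph", "spectroscope"},
    {"metallicity", "abundance"},
    {"soleil", "sun"},
    {"date", "dateline"},
]


def preprocess(query):
    """Normalize catalogue designations and drop unimportant words."""
    # "M 45" and "M-45" become "M45"; "NGC 6543" becomes "NGC6543".
    query = re.sub(r"\b(M|NGC)[ -](\d+)\b", r"\1\2", query)
    terms = []
    for word in query.split():
        if word in CONSTELLATION_ABBREVIATIONS:
            terms.append(CONSTELLATION_ABBREVIATIONS[word])
        elif word.lower() not in STOP_WORDS:
            terms.append(word)
    return terms


def expand(term, use_synonyms=True):
    """Return the term's synonym group, or the term alone if disabled or unknown."""
    key = term.lower()
    if use_synonyms:
        for group in SYNONYM_GROUPS:
            if key in group:
                return set(group)
    return {term}
```

For instance, `preprocess("distance to And")` keeps the capitalized constellation abbreviation but drops "to", and `expand("dateline", use_synonyms=False)` returns only the rare term, which mirrors the option to disable synonym replacement.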
As well as English language synonyms, ADS also searches for English translations of foreign search terms and vice versa, so that a search for the French word soleil retrieves references to Sun, and papers in languages other than English can be returned by English search terms.
Synonym replacement can be disabled if required, so that a rare term which is a synonym of a much more common term (such as 'dateline' rather than 'date') can be searched for specifically.
Selection logic
The search engine allows selection logic both within fields and between fields. Search terms in each field can be combined with OR, AND, simple logic or Boolean logic, and the user can specify which fields must be matched in the search results. This allows complex searches to be built; for example, the user could search for papers concerning NGC 6543 OR NGC 7009, with the paper titles containing (radius OR velocity) AND NOT (abundance OR temperature).
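The example query above can be expressed as a small predicate over paper records. The record layout, the function name and the sample papers are hypothetical; the point is only to show how logic within a field (OR, AND NOT on title words) combines with logic between fields (object AND title).

```python
def matches(paper, objects_any, title_any, title_none):
    """True if the paper concerns any listed object, its title contains at
    least one wanted word, and its title contains none of the unwanted words."""
    words = set(paper["title"].lower().split())
    has_object = bool(set(paper["objects"]) & set(objects_any))
    has_wanted = bool(words & set(title_any))
    has_unwanted = bool(words & set(title_none))
    return has_object and has_wanted and not has_unwanted


# Hypothetical records.
PAPERS = [
    {"id": "p1", "objects": ["NGC 6543"], "title": "Expansion velocity of the halo"},
    {"id": "p2", "objects": ["NGC 7009"], "title": "Radius and temperature of the nebula"},
    {"id": "p3", "objects": ["NGC 1952"], "title": "Velocity field of the filaments"},
]

selected = [
    paper["id"]
    for paper in PAPERS
    if matches(
        paper,
        objects_any=["NGC 6543", "NGC 7009"],
        title_any=["radius", "velocity"],
        title_none=["abundance", "temperature"],
    )
]
```

Only the first record is selected: the second is rejected by the AND NOT clause, and the third concerns a different object.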
Result filtering
Search results can be filtered according to a number of criteria, including specifying a range of years such as '1945 to 1975', '2000 to the present day' or 'before 1900', and what type of journal the article appears in – non-peer reviewed articles such as conference proceedings can be excluded or specifically searched for, or specific journals can be included in or excluded from the search.
Search results
Although it was conceived as a means of accessing abstracts and papers, ADS provides a substantial amount of ancillary information along with search results. For each abstract returned, links are provided to other papers in the database which are referenced, and which cite the paper, and a link is provided to a preprint, where one exists. The system also generates a link to 'also-read' articles – that is, those which have been most commonly accessed by those reading the article. In this way, an ADS user can determine which papers are of most interest to astronomers who are interested in the subject of a given paper.
Also returned are links to the SIMBAD and/or NASA Extragalactic Database object name databases, via which a user can quickly find out basic observational data about the objects analyzed in a paper, and find further papers on those objects.
Impact on astronomy
ADS is almost universally used as a research tool among astronomers, and several studies have estimated quantitatively how much more efficient ADS has made astronomy; one estimated that ADS increased the efficiency of astronomical research by 333 full-time equivalent research years per year, and another found that in 2002 its effect was equivalent to 736 full-time researchers, or all the astronomical research done in France. ADS has allowed literature searches that would previously have taken days or weeks to be completed in seconds, and it is estimated that ADS has increased the readership and use of the astronomical literature by a factor of about three since its inception.
In monetary terms, this increase in efficiency represents a considerable amount. There are about 12,000 active astronomical researchers worldwide, so ADS is the equivalent of about 5% of the working population of astronomers. The global astronomical research budget is estimated at between 4,000 and 5,000 million USD, so the value of ADS to astronomy would be about 200–250 million USD annually. Its operating budget is a small fraction of this amount.
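The arithmetic behind the monetary estimate can be checked directly. The sketch below uses only the figures quoted above; the constant and function names are hypothetical.

```python
ACTIVE_ASTRONOMERS = 12_000
BUDGET_RANGE_MUSD = (4_000, 5_000)  # global research budget, million USD


def fte_fraction(full_time_equivalents):
    """Share of the astronomer workforce that a given FTE gain represents."""
    return full_time_equivalents / ACTIVE_ASTRONOMERS


def value_range_musd(fraction):
    """Low and high value (million USD) of a fraction of the global budget."""
    return tuple(fraction * budget for budget in BUDGET_RANGE_MUSD)


low_estimate = fte_fraction(333)   # from the first study quoted above
high_estimate = fte_fraction(736)  # from the 2002 study quoted above
value_low, value_high = value_range_musd(0.05)
```

The two studies correspond to roughly 3% and 6% of the workforce, which brackets the rounded figure of about 5%; applying 5% to the budget range gives the 200–250 million USD quoted.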
The great importance of ADS to astronomers has been recognized by the United Nations, the General Assembly of which has commended ADS on its work and success, particularly noting its importance to astronomers in the developing world, in reports of the United Nations Committee on the Peaceful Uses of Outer Space. A 2002 report by a visiting committee to the Center for Astrophysics, meanwhile, said that the service had "revolutionized the use of the astronomical literature", and was "probably the most valuable single contribution to astronomy research that the CfA has made in its lifetime".
Sociological studies using ADS
Because it is used almost universally by astronomers, ADS can reveal much about how astronomical research is distributed around the world. Most users access the system from institutes of higher education, whose IP addresses can easily be used to determine the user's geographical location. Studies reveal that the highest per-capita users of ADS are astronomers based in France and the Netherlands, and while more developed countries (measured by GDP per capita) use the system more than less developed countries, the relationship between GDP per capita and ADS use is not linear. The range of ADS usage per capita far exceeds the range of GDPs per capita, and basic research carried out in a country, as measured by ADS usage, has been found to be proportional to the square of the country's GDP divided by its population.
ADS usage statistics also suggest that astronomers in more developed countries tend to be more productive than those in less developed countries. The amount of basic research carried out is proportional to the number of astronomers in a country multiplied by the GDP per capita. Statistics also imply that astronomers in European cultures carry out about three times as much research as those in Asian cultures, perhaps suggesting cultural differences in the importance attached to astronomical research.
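The scaling relations quoted in this section fit together in a short derivation. The symbols are introduced here for illustration only: R is the amount of basic research in a country, N its number of astronomers, G its GDP and P its population. The proportionality of N to G is the one stated in the article's introduction.

```latex
\begin{align}
  R &\propto N \cdot \frac{G}{P}
     && \text{(research scales with astronomers times GDP per capita)} \\
  N &\propto G
     && \text{(number of astronomers scales with GDP)} \\
  \Rightarrow\quad R &\propto \frac{G^{2}}{P}
     && \text{(substituting the second relation into the first)} \\
  \frac{R}{P} &\propto \left(\frac{G}{P}\right)^{2}
     && \text{(dividing by population)}
\end{align}
```

The last line shows why research per capita is not linear in GDP per capita: it grows as the square, so its range across countries is much wider than the range of GDP per capita.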
ADS has also been used to show that the fraction of single-author astronomy papers has decreased substantially since 1975 and that astronomical papers with more than 50 authors have become more common since 1990.
See also
- Bibcode
- NASA/IPAC Extragalactic Database (NED)
- NASA Planetary Data System (PDS)
- PubMed
- SIMBAD
- Michael J. Kurtz
External links
- NASA ADS: Query Form – start your article search here.
- ADS help pages