ChemSpider
ChemSpider is a free chemical database owned by the Royal Society of Chemistry.
Database
The database contains more than 26 million unique molecules from over 400 data sources, including those listed below.
- A-L: EPA DSSTox, U.S. Food and Drug Administration (FDA), Human Metabolome Database, Journal of Heterocyclic Chemistry, KEGG, KUMGM, LeadScope, LipidMAPS
- M-N: Marinlit, MDPI, MICAD, MLSMR, MMDB, MOLI, MTDP, Nanogen, Nature Chemical Biology, NCGC, NIAID, NIH/NLM, NINDS Approved Drug Screening Program, NIST, NIST Chemistry WebBook, NMMLSC, NMRShiftDB
- P-S: PANACHE, PCMD, PDSP, Peptides, Prous Science Drugs of the Future, QSAR, R&D Chemicals, San Diego Center for Chemical Genomics, SGCOxCompounds, SGCStoCompounds, SMID, Specs, Structural Genomics Consortium, SureChem, Synthon-Lab
- T-Z: Thomson Pharma, Total TOSLab Building-Blocks, UM-BBD, UPCMLD, UsefulChem, Web of Science, xPharm, ZINC
Crowdsourcing
The ChemSpider database can be updated with user contributions, including chemical structure deposition, spectra deposition and user curation. This is a crowdsourcing approach to developing an online chemistry database. Crowdsourced curation of the data has produced a dictionary of chemical names associated with chemical structures that has been used in text-mining applications of the biomedical and chemical literature.
Searching
A number of search modules are provided:
- The standard search allows querying for systematic names, trade names, synonyms and registry numbers.
- The advanced search allows interactive searching by chemical structure, chemical substructure, molecular formula, molecular weight range, CAS numbers, suppliers, etc. The search can be used to widen or restrict already found results.
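The idea of restricting an existing result set by formula or molecular weight range can be illustrated with a minimal sketch. It is not ChemSpider's implementation: the `Compound` record, the `RECORDS` list and both filter functions are hypothetical names chosen for this example, and the data is a small hand-written sample.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    formula: str
    mol_weight: float  # average molecular weight in g/mol


# Tiny illustrative record set; not taken from ChemSpider.
RECORDS = [
    Compound("water", "H2O", 18.015),
    Compound("ethanol", "C2H6O", 46.069),
    Compound("dimethyl ether", "C2H6O", 46.069),
    Compound("benzene", "C6H6", 78.114),
]


def by_weight_range(records, low, high):
    """Keep records whose molecular weight lies in [low, high]."""
    return [r for r in records if low <= r.mol_weight <= high]


def by_formula(records, formula):
    """Keep records whose molecular formula matches exactly."""
    return [r for r in records if r.formula == formula]


# First search by weight range, then restrict the hits by formula.
hits = by_weight_range(RECORDS, 40.0, 80.0)
refined = by_formula(hits, "C2H6O")
print([r.name for r in refined])  # ['ethanol', 'dimethyl ether']
```

Because each filter takes a list and returns a list, filters can be chained in any order, which mirrors narrowing a previous result set. Structure and substructure searching need a cheminformatics toolkit and are outside the scope of this sketch.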
Chemistry document mark-up
The ChemSpider database has been used in combination with text mining as the basis of chemistry document markup. ChemMantis, the Chemistry Markup And Nomenclature Transformation Integrated System, uses algorithms to identify and extract chemical names from documents and web pages, and converts the chemical names to chemical structures using name-to-structure conversion algorithms and dictionary look-ups in the ChemSpider database. The result is an integrated system between chemistry documents and information look-up via ChemSpider into over 150 data sources.
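The dictionary look-up step can be sketched in a few lines. This is only an illustration of dictionary-based name recognition, not the ChemMantis algorithm: the `NAME_TO_SMILES` dictionary, the function names and the `name[SMILES]` output format are all assumptions made for the example.

```python
import re

# Hypothetical name-to-structure dictionary (chemical name -> SMILES).
NAME_TO_SMILES = {
    "acetic acid": "CC(=O)O",
    "ethanol": "CCO",
    "benzene": "c1ccccc1",
}


def _compile(dictionary):
    # Try longer names first so multi-word names win over shorter overlaps.
    names = sorted(dictionary, key=len, reverse=True)
    alternatives = "|".join(re.escape(n) for n in names)
    return re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)


def find_chemical_names(text, dictionary=NAME_TO_SMILES):
    """Return (start, end, name, smiles) for each dictionary name in text."""
    pattern = _compile(dictionary)
    return [
        (m.start(), m.end(), m.group(1).lower(), dictionary[m.group(1).lower()])
        for m in pattern.finditer(text)
    ]


def mark_up(text, dictionary=NAME_TO_SMILES):
    """Annotate each recognised name in place as name[SMILES]."""
    pattern = _compile(dictionary)
    return pattern.sub(
        lambda m: f"{m.group(0)}[{dictionary[m.group(0).lower()]}]", text
    )


sentence = "Ethanol was oxidised to acetic acid in benzene."
print(mark_up(sentence))
```

A real system would combine look-ups like this with name-to-structure conversion for systematic names that are not in the dictionary.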
History
ChemSpider was acquired by the Royal Society of Chemistry in May 2009. Prior to the acquisition by RSC, ChemSpider was controlled by a private corporation, ChemZoo Inc. The system was first launched in March 2007 in a beta release form and transitioned to release in March 2008. ChemSpider has expanded the generic support of a chemistry database to include support of the Wikipedia chemical structure collection via their WiChempedia implementation.
Services
A number of services are made available online. These include the conversion of chemical names to chemical structures, the generation of SMILES and InChI strings, the prediction of many physicochemical parameters, and integration with a web service allowing NMR prediction. The organization is working with RSC to develop a hash table resolver for InChIKeys, shorter hashed forms of InChIs.
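Because an InChIKey is a fixed-length hash of an InChI, the InChI cannot be computed back from the key, so a resolver has to look it up. The following minimal sketch shows one way such a look-up table could work. The `InChIKeyResolver` class and its methods are hypothetical and do not describe the RSC service; the example InChI and InChIKey pairs are the standard identifiers for water and ethanol.

```python
import re

# A standard InChIKey has 27 characters: blocks of 14, 10 and 1 uppercase
# letters separated by hyphens.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


class InChIKeyResolver:
    """Hypothetical in-memory resolver mapping InChIKeys to full InChIs."""

    def __init__(self):
        self._table = {}  # Python dicts are hash tables

    def register(self, inchikey, inchi):
        if not INCHIKEY_RE.fullmatch(inchikey):
            raise ValueError(f"not a well-formed InChIKey: {inchikey!r}")
        self._table[inchikey] = inchi

    def resolve(self, inchikey):
        """Return the registered InChI, or None if the key is unknown."""
        return self._table.get(inchikey)


resolver = InChIKeyResolver()
resolver.register("XLYOFNOQVPJJNP-UHFFFAOYSA-N", "InChI=1S/H2O/h1H2")
resolver.register(
    "LFQSCWFLJHTTHZ-UHFFFAOYSA-N", "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
)

print(resolver.resolve("XLYOFNOQVPJJNP-UHFFFAOYSA-N"))  # InChI=1S/H2O/h1H2
```

The sketch only checks that a key is well formed; generating an InChIKey from a structure requires the InChI software and is not attempted here.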
Open PHACTS
ChemSpider is serving as the chemical compound repository for the Open PHACTS project, a project of the Innovative Medicines Initiative. Open PHACTS will deploy an open standards, open access, semantic web approach to address key bottlenecks in small molecule drug discovery: disparate information sources, lack of standards and information overload.
See also
- eMolecules
- NIST
- PubChem
- DrugBank
- ChEBI
- ChEMBL
- Software for molecular modeling
Further reading
- E. Curry, A. Freitas, and S. O’Riáin, “The Role of Community-Driven Data Curation for Enterprises,” in Linking Enterprise Data, D. Wood, Ed. Boston, MA: Springer US, 2010, pp. 25-47.