Open Notebook Science
Open Notebook Science is the practice of making the entire primary record of a research project publicly available online as it is recorded. This involves placing the personal, or laboratory, notebook of the researcher online along with all raw and processed data, and any associated material, as this material is generated. The approach may be summed up by the slogan 'no insider information'. It is the logical extreme of transparent approaches to research and explicitly includes making available failed, less significant, and otherwise unpublished experiments, so-called 'Dark Data'. The practice of Open Notebook Science, although not the norm in the academic community, has gained significant recent attention in the research, general, and peer-reviewed media as part of a general trend towards more open approaches in research practice and publishing. Open Notebook Science can therefore be described as part of a wider open science movement that includes the advocacy and adoption of open access publication, open data, crowdsourcing data, and citizen science. It is inspired in part by the success of open-source software and draws on many of its ideas.
History
The term Open Notebook Science was first used in a blog post by Jean-Claude Bradley, an Associate Professor of Chemistry at Drexel University. Bradley described Open Notebook Science as follows:
Experimental
- Jean-Claude Bradley (notebook)
- Andrew S.I.D. Lang (notebook)
- Cameron Neylon (notebook)
- Open Notebook Science Challenge (notebook)
- Raf Aerts (notebook)
- Alejandro Tamayo (Fruit Computer Laboratory notebook, blog)
- Mike Lawrence (notebook)
- Andy Maloney (notebook), Postdoctoral researcher in Smyth lab at University of Texas. Ph.D. in KochLab at the University of New Mexico (2011, Ph.D notebook, Open PhD dissertation).
- Anthony Salvagno (notebook), Physics Ph.D. student in KochLab at the University of New Mexico.
- Andrés G. Saravia (notebook), Physics Ph.D. student at Cinvestav-Mérida.
Theoretical
- Tobias J. Osborne (notebook)
- Stephen McIntyre (notebook on climate change)
- Carl Boettiger, theory and computational modeling in ecology and evolution (notebook, intro to notebook)
- Dror Bar-Natan (http://katlas.math.toronto.edu/drorbn/AcademicPensieve/)
Archived
- Jeremiah Faith (notebook archived April 15, 2008)
- Human/Swine A/H1N1 Influenza Origins and Evolution
- Linh Le (Notebook), undergraduate physics major and alumnus of KochLab at the University of New Mexico.
- Brigette Black (notebook), Physics Ph.D. student in KochLab at the University of New Mexico.
Recurrent (Educational)
Partial/Pseudo Open Notebooks
These are initiatives more open than traditional laboratory notebooks but lacking a key component for full Open Notebook Science. Usually the notebook is either only partially shared or shared with significant delay.
- Protocolpedia: allows sharing and storage of lab protocols; also offers a free iPhone app
- Sci-Mate: allows users to define access permissions, but can be used as an open notebook tool
- Vinod Scaria (notebook, needs login)
- OpenWetWare (hosts many laboratories and allows for selective sharing of information related to each research group)
- Caleb Morse (notebook)
- Gus Rosania (notebook)
- Antony Garrett Lisi (notebook)
- Rosie Redfield (research blog), microbiologist at the University of British Columbia; all results discussed but raw experimental notebook is not exposed.
- Martin Johnson (notebook), marine chemist at the University of East Anglia. (Selective Content, Immediate Sharing)
- Greg Lang (notebooks), postdoc in David Botstein's lab at Princeton University. (All Content, Delayed Sharing; shared approximately weekly)
Benefits of Open Notebook Science
The aim of Open Notebook Science is to make the full record of scientific research available. This enables other scientists to obtain detailed descriptions of procedures and raw and analyzed data, either to compare with their own work or to build on. Advocates argue that this can improve the communication of science, increase the rate at which research can progress, and reduce time lost due to the repetition of failed experiments. In particular, advocates argue that it enables more effective collaboration, including new forms of collaboration in which the collaborators are not necessarily known in advance.
One of the goals of open notebook science is to "improve scientific communication".
A public laboratory notebook makes it convenient to cite the exact instances of experiments used to support arguments in articles. For example, in a paper on the optimization of a Ugi reaction, three different batches of product are used in the characterization and each spectrum references the specific experiment where each batch was used: EXP099, EXP203 and EXP206. This work was subsequently published in the Journal of Visualized Experiments, demonstrating that the integrity of data provenance can be maintained from lab notebook to final publication in a peer-reviewed journal.
Without further qualifications, Open Notebook Science implies that the research is being reported on an ongoing basis without unreasonable delay or filter. This enables others to understand exactly how research actually happens within a field or a specific research group. Such information could be of value to collaborators, prospective students or future employers. Providing access to selective notebook pages or inserting an embargo period would be inconsistent with the meaning of the term "Open" in this context. Unless error corrections, failed experiments and ambiguous results are reported, it will not be possible for an outside observer to understand exactly how science is being done. Terms such as Pseudo or Partial have been used as qualifiers for the sharing of laboratory notebook information in a selective way or with a significant delay.
Arguments against Open Notebook Science
The arguments against adopting Open Notebook Science fall mainly into three categories, which have differing importance in different fields of science. The primary concern, expressed particularly by biological and medical scientists, is that of 'data theft' or 'being scooped'. While the degree to which research groups steal or adapt the results of others remains a subject of debate, it is certainly the case that the fear of not being first to publish drives much behaviour, particularly in some fields. This is related to the focus in these fields on the published peer-reviewed paper as the main metric of career success.
The second argument advanced against Open Notebook Science is that it constitutes prior publication, thus making it impossible to patent or publish the results in the traditional peer-reviewed literature. With respect to patents, publication on the web is clearly classified as disclosure. Therefore, while there may be arguments over the value of patents, and approaches that get around this problem, it is clear that Open Notebook Science is not appropriate for research for which patent protection is an expected and desired outcome. With respect to publication in the peer-reviewed literature the case is less clear-cut. Most publishers of scientific journals accept material that has previously been presented at a conference or in the form of a preprint. Those publishers that accept material previously published in these forms have generally indicated informally that web publication of data, including Open Notebook Science, falls into this category. However, this has not been tested with a wide range of publishers. It is to be expected that those publishers that explicitly exclude these forms of pre-publication will not accept material previously disclosed in an open notebook.
The final argument relates to the problem of the 'data deluge'. If the current volume of the peer-reviewed literature is too large for any one person to manage, then how can anyone be expected to cope with the huge quantity of non-peer-reviewed material that could potentially be available, especially when some, perhaps most, would be of poor quality? A related argument is that 'my notebook is too specific' to be of interest to anyone else. How to discover high-quality and relevant material is a related question. Curating data and validating its methodological quality is a serious problem, one that arguably has relevance beyond Open Notebook Science but is a particular challenge here.
Funding and Sponsorship
The Open Notebook Science Challenge, now directed towards reporting solubility measurements in non-aqueous solvents, has received sponsorship from Submeta, Nature and Sigma-Aldrich. The first of ten winners of the contest for December 2008 was Jenny Hale.
Logos
Logos can be used on Notebooks to indicate the conditions of sharing. Fully Open Notebooks are marked as "All Content" and "Immediate" access. Partially Open Notebooks can be marked as "Selected Content", "Delayed", or both.
See also
- Open access (publishing)
- Open data
- Open research
- Open content
- Open source