Difference between revisions of "Discuss what we will consider a GPF"

From Geoscience Paper of the Future

Revision as of 15:31, 9 April 2015


New Frameworks to Create a New Generation of Scientific Articles

Several frameworks have been developed to document scientific articles so that they are more useful to researchers than a simple PDF. Examples include IPython Notebook and Weaver (for R).

Elsevier has invested in several initiatives in this direction. It ran an Executable Papers Challenge (http://www.executablepapers.com/about-challenge.html) and introduced a new type of paper called a software paper (http://www.elsevier.com/about/content-innovation/original-software-publications#overview). It also publishes articles of the future (http://www.articleofthefuture.com/) in different disciplines (see this paleontology example: http://www.articleofthefuture.com/S0031018208004690/), in which the figures are interactive and can be easily downloaded for slide presentations, the citations are hyperlinked, and so on. Those efforts are complementary to what we are trying to do here.

The Case of the Tuberculosis Drugome

In this case, the work from a previously published paper was reproduced using a workflow system, and the data and software were made explicit and published as linked open data in RDF (i.e., as accessible Web objects in the Semantic Web). The data were assigned DOIs, as was the workflow.

  • the original "drugome" paper: http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000976
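
To make the idea of publishing data and a workflow as linked open data more concrete, here is a minimal sketch in Python. It is not the representation used in the drugome work. All identifiers below (the DOIs under the 10.5555 prefix and the example.org URLs) are made-up placeholders, and the choice of W3C PROV terms and the N-Triples serialization is an assumption made for illustration.

```python
# Placeholder identifiers, assumed for illustration only. They do not
# resolve to the real drugome data or workflow.
DATASET = "https://doi.org/10.5555/example-dataset"
WORKFLOW = "https://doi.org/10.5555/example-workflow"
RUN = "https://example.org/runs/1"
RESULT = "https://example.org/results/1"

# Vocabulary terms from RDF and the W3C PROV ontology.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
PROV = "http://www.w3.org/ns/prov#"

# Each triple is (subject, predicate, object); every element is an IRI.
triples = [
    (DATASET, RDF_TYPE, PROV + "Entity"),
    (WORKFLOW, RDF_TYPE, PROV + "Entity"),
    (RUN, RDF_TYPE, PROV + "Activity"),
    (RUN, PROV + "used", DATASET),
    (RUN, PROV + "used", WORKFLOW),
    (RESULT, RDF_TYPE, PROV + "Entity"),
    (RESULT, PROV + "wasGeneratedBy", RUN),
]


def to_ntriples(rows):
    """Serialize IRI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in rows)


print(to_ntriples(triples))
```

Because every subject and object is a Web address, a reader (or a program) can follow the link from a result back to the run that produced it, and from the run to the dataset and workflow it used.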

Looking at the Future

The Vision

In the future, scientists will use radically new tools to generate papers. As scientists do their work, those tools will document the work and all the associated digital objects (data, software, etc.), so that when it comes time to publish a paper everything will be easily documented and included. Today, several research tools exist for working in this way, but they are not routinely used and they do not always fit a scientist's research workflow.

In the future, publishers will accept submissions that contain not just a PDF but also data, software, and other digital objects relevant to the research. Today, many journals accept datasets together with papers, and some journals accept software and software papers, but no journal includes the full details of the data, software, workflow, and visualizations of a paper.

In the future, readers of papers will be able to interact with the paper document, modify its figures to explore the data, reproduce the results, and run the method with new data. Today, readers simply get a static paper, and even if the data are available they have to download and analyze them themselves.

In the future, data producers and software developers will get credit for their work because all publications that build on it will acknowledge it through citations. Today, there is limited credit and reward for those who create data and software.

What is a Geoscience Paper of the Future?

A paper is one thing (think of a larger wrapper, with a conceptual framework), as opposed to the smaller bits (code, datasets, individual figures) that are updated along the way (e.g., get associated with your ORCID). We don't want to get into the stigma of least publishable units. We also recognize that there are different types of publications (letter, full paper, etc.) for different-sized contributions.

A GPF paper includes:

  • data: documented, described in a public repository, has a license specified and is open if possible, and cited with DOIs
  • software: documented, in a public repository, has a license specified and is open source if possible, and cited with DOIs
  • provenance: explicitly documented as a workflow sketch, a formal workflow, or a provenance record (in PROV or a similar standard), possibly in a shared repository and given a DOI
  • figures/visualizations: generated by explicit code (if possible) and the result of a workflow or provenance record. (Figures may be a "prettified" version of the published version.)

Not all GPF papers will be able to satisfy all of these requirements. For example, some collaborators may not want to release the data or some of the software. In those cases, the papers will explain the issues encountered in trying to make these releases and the challenges they pose to the future of open, reproducible publications.
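
The list above can be read as a checklist. The following is a minimal sketch, assuming a simple in-memory description of a paper, of how an author might report which items are still missing so that the gaps can be explained in the paper. The class names, field names, and example entries are hypothetical and are not part of any GPF tooling.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DigitalObject:
    """A dataset or a piece of software used in the paper (hypothetical model)."""
    name: str
    documented: bool = False
    in_public_repository: bool = False
    license: Optional[str] = None
    doi: Optional[str] = None

    def gaps(self) -> List[str]:
        """Return the checklist items this object does not yet satisfy."""
        found = []
        if not self.documented:
            found.append("documentation")
        if not self.in_public_repository:
            found.append("public repository")
        if self.license is None:
            found.append("license")
        if self.doi is None:
            found.append("DOI")
        return found


@dataclass
class Paper:
    data: List[DigitalObject] = field(default_factory=list)
    software: List[DigitalObject] = field(default_factory=list)
    provenance_documented: bool = False
    figures_from_code: bool = False


def gpf_report(paper: Paper) -> Dict[str, List[str]]:
    """Map each item with open gaps to the list of what is missing."""
    report: Dict[str, List[str]] = {}
    for obj in paper.data + paper.software:
        gaps = obj.gaps()
        if gaps:
            report[obj.name] = gaps
    if not paper.provenance_documented:
        report["provenance"] = ["workflow sketch, formal workflow, or provenance record"]
    if not paper.figures_from_code:
        report["figures"] = ["generating code"]
    return report


# Example entries are invented; the DOI is a placeholder.
paper = Paper(
    data=[DigitalObject("stream gauge data", documented=True,
                        in_public_repository=True, license="CC-BY-4.0")],
    software=[DigitalObject("analysis scripts", documented=True,
                            in_public_repository=True, license="MIT",
                            doi="10.5555/example-software")],
    provenance_documented=True,
)
report = gpf_report(paper)
for item, gaps in report.items():
    print(f"{item}: missing {', '.join(gaps)}")
```

A report like this does not decide whether a paper qualifies as a GPF. It only lists the gaps that the authors would then explain in the paper.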