
Packaging data analytical work reproducibly using R (and friends)

Long considered an axiom of science, the reproducibility of scientific research has recently come under scrutiny after some highly publicized failures to reproduce results. These failures have often been linked to the current model of journal publishing, which does not provide enough detail for reviewers to adequately assess the correctness of papers submitted for publication. One early proposal for ameliorating this situation is to bundle the different files that make up a research result into a 'compendium'. At the time it was originally proposed, creating a compendium was a complex process. In this talk I show how modern software tools and services have substantially lightened the burden of making compendia. I describe current approaches to making these compendia to accompany journal articles. Several recent projects of varying sizes are briefly presented to show how my colleagues and I are using R and related tools (e.g., version control, continuous integration, containers, repositories) to make compendia for our publications. I explain how these approaches, which we believe to be applicable to many types of research work, subvert the constraints of the typical journal article and improve the efficiency and reproducibility of our research.