Concept

Reproducibility & Replicability

Carlos Granell

GEOTEC, Universitat Jaume I

Apr 22, 2024

Today’s reality

  • Computation has an increasing role in scientific research (Stodden and Miguez 2014)

  • Many and diverse computational sciences (bio-informatics, geophysics, material science, fluid mechanics, climate modelling, computational chemistry, …) (Barba 2021)

  • As results are increasingly produced by complex computational processes…

…the traditional methods section of a scientific paper is no longer sufficient

The inverse problem

‘Show me’, not ‘trust me’

Show me = help me if you can

“If I say: ‘here’s my work’ and it’s wrong, I might have erred, but at least I am honest.”

Trust me = catch me if you can

“If I publish a paper long on results but short on methods, and it’s wrong, that makes me untrustworthy.”

Definitions

{Re}* terms

  • Reproducible research: Authors provide all the necessary data and the computer codes to run the analysis again, re-creating the results.

  • Reproducibility: A study is reproducible if all of the code and data used to generate the numbers and figures in the paper are available and exactly produce the published results.

  • Replication: A study that arrives at the same scientific findings as another study, collecting new data (possibly with different methods) and completing new analyses.

  • Replicability: A study is replicable if an identical experiment can be performed like the first study and the statistical results are consistent.

  • False discovery: A study is a false discovery if the result presented in the study produces the wrong answer to the question of interest.

Our view

A reproducible paper ensures a reader can recreate the computational workflow of a study, including the prerequisite knowledge and computational environment

  • The former implies the scientific argument to be understandable and sound

  • The latter requires a detailed description of the software and data used, both of which must be openly available
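As an illustration of what a "detailed description of software and data" could look like in practice, the following minimal Python sketch records the interpreter, operating system, and package versions, and computes a checksum that pins the exact input data. It is only a sketch under stated assumptions: the function names `describe_environment` and `fingerprint` and the package names queried are hypothetical and chosen for illustration, not part of any tool mentioned in these slides.

```python
import hashlib
import json
import platform
from importlib import metadata


def describe_environment(packages):
    """Return a dict describing the interpreter, OS, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": versions,
    }


def fingerprint(data: bytes) -> str:
    """SHA-256 digest of the raw bytes of a data file."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical package names, used only to show both outcomes.
env = describe_environment(["pip", "hypothetical-package-not-installed-xyz"])
print(json.dumps(env, indent=2))

# Hypothetical in-memory data standing in for the contents of a data file.
print(fingerprint(b"station,temperature\nA,21.5\n"))
```

Publishing such a record next to the code and data gives a reader a concrete starting point for recreating the computational environment; it does not replace making the software and data themselves openly available.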

We define reproducibility to mean

computational reproducibility

Course definition

REPRODUCIBILITY involves ORIGINAL data and (computational) methods

REPLICABILITY involves NEW data and/or (computational) methods
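The distinction in the course definition can be sketched with a toy example in Python. Everything here is an assumption made for illustration: `collect_data` stands in for data collection (simulated with a seeded random generator), `analyse` stands in for the computational method, and the tolerance of 0.5 used to judge consistency is arbitrary.

```python
import random
import statistics


def collect_data(seed, n=1000):
    """Stand-in for data collection: n draws from a Gaussian with mean 10."""
    rng = random.Random(seed)
    return [rng.gauss(10.0, 2.0) for _ in range(n)]


def analyse(data):
    """Stand-in for the computational method: estimate the mean."""
    return statistics.fmean(data)


# The "published" study: original data and original method.
original_data = collect_data(seed=42)
published = analyse(original_data)

# Reproduction: ORIGINAL data and methods give the identical number.
reproduced = analyse(original_data)
assert reproduced == published

# Replication: NEW data with the same method give a consistent,
# but not identical, result (0.5 is an arbitrary tolerance).
new_data = collect_data(seed=7)
replicated = analyse(new_data)
consistent = abs(replicated - published) < 0.5
print(published, reproduced, replicated, consistent)
```

The reproduction is checked for exact equality, whereas the replication can only be checked for consistency within a tolerance, which mirrors the difference between re-creating results and arriving at the same finding with new data.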

Reproducibility spectrum

References

Barba, Lorena A. 2018. “Terminologies for Reproducible Research.” arXiv Preprint arXiv:1802.03311. https://arxiv.org/abs/1802.03311.
———. 2021. “Trustworthy Computational Evidence Through Transparency and Reproducibility.” Computing in Science & Engineering 23 (1): 58–64. https://doi.org/10.1109/MCSE.2020.3048406.
Claerbout, JF, and M Karrenbach. 1992. “Electronic Documents Give Reproducible Research a New Meaning.” In SEG Technical Program Expanded Abstracts 1992, 601–4. Society of Exploration Geophysicists. https://doi.org/10.1190/1.1822162.
Donoho, DL, A Maleki, IU Rahman, M Shahram, and V Stodden. 2009. “Reproducible Research in Computational Harmonic Analysis.” Computing in Science & Engineering 11 (1): 8–18. https://doi.org/10.1109/MCSE.2009.15.
Leek, JT, and LR Jager. 2017. “Is Most Published Research Really False?” Annual Review of Statistics and Its Application 4: 109–22. https://doi.org/10.1146/annurev-statistics-060116-054104.
Nüst, Daniel, and Stephen J Eglen. 2021. “CODECHECK: An Open Science Initiative for the Independent Execution of Computations Underlying Research Articles During Peer Review to Improve Reproducibility.” F1000Research 10 (March): 253. https://doi.org/10.12688/f1000research.51738.1.
Nüst, Daniel, C Granell, B Hofer, M Konkol, FO Ostermann, R Sileryte, and V Cerutti. 2018. “Reproducible Research and GIScience: An Evaluation Using AGILE Conference Papers.” PeerJ 6: e5072. https://doi.org/10.7717/peerj.5072.
Ostermann, FO, and C Granell. 2017. “Advancing Science with VGI: Reproducibility and Replicability of Recent Studies Using VGI.” Transactions in GIS 21 (2): 224–37. https://doi.org/10.1111/tgis.12195.
Ostermann, Frank O., Daniel Nüst, Carlos Granell, Barbara Hofer, and Markus Konkol. 2021. “Reproducible Research and GIScience: An Evaluation Using GIScience Conference Papers.” In 11th International Conference on Geographic Information Science (GIScience 2021) - Part II, edited by Krzysztof Janowicz and Judith A. Verstegen, 208:2:1–16. Leibniz International Proceedings in Informatics (LIPIcs). Dagstuhl, Germany: Schloss Dagstuhl – Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.GIScience.2021.II.2.
Peng, RD. 2011. “Reproducible Research in Computational Science.” Science 334 (6060): 1226–27. https://doi.org/10.1126/science.1213847.
Stark, PB. 2018. “Before Reproducibility Must Come Preproducibility.” Nature 557 (7706): 613–14. https://doi.org/10.1038/d41586-018-05256-0.
Stodden, V, and SB Miguez. 2014. “Best Practices for Computational Science: Software Infrastructure and Environments for Reproducible and Extensible Research.” Journal of Open Research Software 2 (1): e21. https://doi.org/10.5334/jors.ay.
The Turing Way Community. 2022. “The Turing Way: A Handbook for Reproducible, Ethical and Collaborative Research.” Zenodo. https://doi.org/10.5281/zenodo.3233853.